All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH] serial: 8250: add lock for dma rx
       [not found] <20211209073339.21694-1-wigin.zeng@dji.com>
@ 2021-12-09  7:38 ` Greg KH
  2021-12-09  8:15   ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-09  7:38 UTC (permalink / raw)
  To: wigin.zeng; +Cc: jirislaby, linux-serial, linux-kernel

On Thu, Dec 09, 2021 at 03:33:39PM +0800, wigin.zeng wrote:
> Need to add lock to protect the tty buffer in dma rx handler and serial
> interrupt handler, there is chance that serial handler and dma handler
> executing in same time in multi cores and RT enabled scenario.

Are you sure?  Why has this not been a problem before now?  What
changed?

> Signed-off-by: wigin.zeng <wigin.zeng@dji.com>

I do not think you have a "." in the name you use to sign documents,
right?  Please use your real name here.


> ---
>  drivers/tty/serial/8250/8250_dma.c  | 2 ++
>  drivers/tty/serial/8250/8250_port.c | 3 +++
>  include/linux/serial_core.h         | 1 +
>  3 files changed, 6 insertions(+)
> 
> diff --git a/drivers/tty/serial/8250/8250_dma.c b/drivers/tty/serial/8250/8250_dma.c
> index 890fa7ddaa7f..592b9906e276 100644
> --- a/drivers/tty/serial/8250/8250_dma.c
> +++ b/drivers/tty/serial/8250/8250_dma.c
> @@ -48,6 +48,7 @@ static void __dma_rx_complete(void *param)
>         struct dma_tx_state     state;
>         int                     count;
> 
> +       spin_lock(&p->port.rx_lock);
>         dma->rx_running = 0;
>         dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
> 
> @@ -55,6 +56,7 @@ static void __dma_rx_complete(void *param)
> 
>         tty_insert_flip_string(tty_port, dma->rx_buf, count);
>         p->port.icount.rx += count;
> +       spin_unlock(&p->port.rx_lock);
> 
>         tty_flip_buffer_push(tty_port);
>  }
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index 5775cbff8f6e..4d8662df8d61 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -1780,6 +1780,7 @@ unsigned char serial8250_rx_chars(struct uart_8250_port *up, unsigned char lsr)
>         struct uart_port *port = &up->port;
>         int max_count = 256;
> 
> +       spin_lock(&port->rx_lock);
>         do {
>                 serial8250_read_char(up, lsr);
>                 if (--max_count == 0)
> @@ -1787,6 +1788,7 @@ unsigned char serial8250_rx_chars(struct uart_8250_port *up, unsigned char lsr)
>                 lsr = serial_in(up, UART_LSR);
>         } while (lsr & (UART_LSR_DR | UART_LSR_BI));
> 
> +       spin_unlock(&port->rx_lock);
>         tty_flip_buffer_push(&port->state->port);
>         return lsr;
>  }
> @@ -3267,6 +3269,7 @@ void serial8250_init_port(struct uart_8250_port *up)
>         struct uart_port *port = &up->port;
> 
>         spin_lock_init(&port->lock);
> +       spin_lock_init(&port->rx_lock);
>         port->ops = &serial8250_pops;
>         port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_8250_CONSOLE);
> 
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
> index c58cc142d23f..77980b6f0c27 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -105,6 +105,7 @@ typedef unsigned int __bitwise upstat_t;
> 
>  struct uart_port {
>         spinlock_t              lock;                   /* port lock */
> +       spinlock_t              rx_lock;                /* port rx lock */

Why can you not just use 'lock' here instead if this is really an issue?

And doesn't this slow things down?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-09  7:38 ` [PATCH] serial: 8250: add lock for dma rx Greg KH
@ 2021-12-09  8:15   ` wigin zeng
  2021-12-09  8:42     ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: wigin zeng @ 2021-12-09  8:15 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel

We encountered this issue when UART transfer very intensive.
DMA irq-thread processed on CPU0 and serial irq-thread executing on CPU1, 
In DMA irq-thread will invoke "tty_insert_filp_string" function to add the rx_buf into tty_buffer.
In serial irq-thread also has chance to access tty_insert_flip_char(in serial8250_rx_chars ) to access tty_buffer.
there is race condition, sometimes will cause panic.
We add the spin_lock to sync the tty_buffer operation, and the issue gone after applied the patch.

BRs
Weijun

-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月9日 15:39
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org
主题: Re: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



On Thu, Dec 09, 2021 at 03:33:39PM +0800, wigin.zeng wrote:
> Need to add lock to protect the tty buffer in dma rx handler and 
> serial interrupt handler, there is chance that serial handler and dma 
> handler executing in same time in multi cores and RT enabled scenario.

Are you sure?  Why has this not been a problem before now?  What changed?

> Signed-off-by: wigin.zeng <wigin.zeng@dji.com>

I do not think you have a "." in the name you use to sign documents, right?  Please use your real name here.


> ---
>  drivers/tty/serial/8250/8250_dma.c  | 2 ++  
> drivers/tty/serial/8250/8250_port.c | 3 +++
>  include/linux/serial_core.h         | 1 +
>  3 files changed, 6 insertions(+)
>
> diff --git a/drivers/tty/serial/8250/8250_dma.c 
> b/drivers/tty/serial/8250/8250_dma.c
> index 890fa7ddaa7f..592b9906e276 100644
> --- a/drivers/tty/serial/8250/8250_dma.c
> +++ b/drivers/tty/serial/8250/8250_dma.c
> @@ -48,6 +48,7 @@ static void __dma_rx_complete(void *param)
>         struct dma_tx_state     state;
>         int                     count;
>
> +       spin_lock(&p->port.rx_lock);
>         dma->rx_running = 0;
>         dmaengine_tx_status(dma->rxchan, dma->rx_cookie, &state);
>
> @@ -55,6 +56,7 @@ static void __dma_rx_complete(void *param)
>
>         tty_insert_flip_string(tty_port, dma->rx_buf, count);
>         p->port.icount.rx += count;
> +       spin_unlock(&p->port.rx_lock);
>
>         tty_flip_buffer_push(tty_port);  } diff --git 
> a/drivers/tty/serial/8250/8250_port.c 
> b/drivers/tty/serial/8250/8250_port.c
> index 5775cbff8f6e..4d8662df8d61 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -1780,6 +1780,7 @@ unsigned char serial8250_rx_chars(struct uart_8250_port *up, unsigned char lsr)
>         struct uart_port *port = &up->port;
>         int max_count = 256;
>
> +       spin_lock(&port->rx_lock);
>         do {
>                 serial8250_read_char(up, lsr);
>                 if (--max_count == 0)
> @@ -1787,6 +1788,7 @@ unsigned char serial8250_rx_chars(struct uart_8250_port *up, unsigned char lsr)
>                 lsr = serial_in(up, UART_LSR);
>         } while (lsr & (UART_LSR_DR | UART_LSR_BI));
>
> +       spin_unlock(&port->rx_lock);
>         tty_flip_buffer_push(&port->state->port);
>         return lsr;
>  }
> @@ -3267,6 +3269,7 @@ void serial8250_init_port(struct uart_8250_port *up)
>         struct uart_port *port = &up->port;
>
>         spin_lock_init(&port->lock);
> +       spin_lock_init(&port->rx_lock);
>         port->ops = &serial8250_pops;
>         port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_8250_CONSOLE);
>
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h 
> index c58cc142d23f..77980b6f0c27 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -105,6 +105,7 @@ typedef unsigned int __bitwise upstat_t;
>
>  struct uart_port {
>         spinlock_t              lock;                   /* port lock */
> +       spinlock_t              rx_lock;                /* port rx lock */

Why can you not just use 'lock' here instead if this is really an issue?

And doesn't this slow things down?

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-09  8:15   ` 答复: " wigin zeng
@ 2021-12-09  8:42     ` Greg KH
  2021-12-09  9:08       ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-09  8:42 UTC (permalink / raw)
  To: wigin zeng; +Cc: jirislaby, linux-serial, linux-kernel

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Thu, Dec 09, 2021 at 08:15:00AM +0000, wigin zeng wrote:
> We encountered this issue when UART transfer very intensive.

What issue exactly?

> DMA irq-thread processed on CPU0 and serial irq-thread executing on CPU1, 
> In DMA irq-thread will invoke "tty_insert_filp_string" function to add the rx_buf into tty_buffer.
> In serial irq-thread also has chance to access tty_insert_flip_char(in serial8250_rx_chars ) to access tty_buffer.
> there is race condition, sometimes will cause panic.

But what data is being accessed at the same time to cause a crash?
How is data being added into the buffer at the same time in two
different places into the same queue?  What userspace programs are
causing this?

> We add the spin_lock to sync the tty_buffer operation, and the issue gone after applied the patch.

So all tty buffer accesses need to be protected by your new lock?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-09  8:42     ` Greg KH
@ 2021-12-09  9:08       ` wigin zeng
  2021-12-09 10:07         ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: wigin zeng @ 2021-12-09  9:08 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel, First Light

> What issue exactly?
The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.

> But what data is being accessed at the same time to cause a crash?
> How is data being added into the buffer at the same time in two different places into the same queue?  What userspace programs are causing this?
Both places will modify tty_port *port->buf.tail (kmalloc operation and write the data/flag into this address)

>So all tty buffer accesses need to be protected by your new lock?
New lock only protected the tty_buffer alloc and write operation in serial-tty case. 

BRs
Wigin
-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月9日 16:42
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org
主题: Re: 答复: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Thu, Dec 09, 2021 at 08:15:00AM +0000, wigin zeng wrote:
> We encountered this issue when UART transfer very intensive.

What issue exactly?

> DMA irq-thread processed on CPU0 and serial irq-thread executing on 
> CPU1, In DMA irq-thread will invoke "tty_insert_filp_string" function to add the rx_buf into tty_buffer.
> In serial irq-thread also has chance to access tty_insert_flip_char(in serial8250_rx_chars ) to access tty_buffer.
> there is race condition, sometimes will cause panic.

But what data is being accessed at the same time to cause a crash?
How is data being added into the buffer at the same time in two different places into the same queue?  What userspace programs are causing this?

> We add the spin_lock to sync the tty_buffer operation, and the issue gone after applied the patch.

So all tty buffer accesses need to be protected by your new lock?

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-09  9:08       ` 答复: " wigin zeng
@ 2021-12-09 10:07         ` Greg KH
  2021-12-20  5:27           ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-09 10:07 UTC (permalink / raw)
  To: wigin zeng; +Cc: jirislaby, linux-serial, linux-kernel, First Light

On Thu, Dec 09, 2021 at 09:08:51AM +0000, wigin zeng wrote:
> > What issue exactly?
> The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.

What do you mean exactly by "package size"?  Isn't it up to the DMA
transfer to do the whole copy?

> > But what data is being accessed at the same time to cause a crash?
> > How is data being added into the buffer at the same time in two different places into the same queue?  What userspace programs are causing this?
> Both places will modify tty_port *port->buf.tail (kmalloc operation and write the data/flag into this address)

What is the "second" place here?  Data should only be coming in from the
DMA transfer copy, right?

> >So all tty buffer accesses need to be protected by your new lock?
> New lock only protected the tty_buffer alloc and write operation in serial-tty case. 

You need to document that really well as it does not seem that all
accesses are now protected by this lock.  Only one odd access is, and I
still do not understand how this is happening at the same time now.

Again, what changed recently to cause this to start happening?  Why is
this only showing up now?  What is unique about your system that causes
this and prevents it from happening on any other system?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-09 10:07         ` Greg KH
@ 2021-12-20  5:27           ` wigin zeng
  2021-12-20  8:54             ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: wigin zeng @ 2021-12-20  5:27 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel, First Light

[-- Attachment #1: Type: text/plain, Size: 3772 bytes --]

Sorry for late response.

> >> What issue exactly?
> The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.
>What do you mean exactly by "package size"?  Isn't it up to the DMA transfer to do the whole copy?
 
The attachment is an example for the race condition issue. E.g: 514bytes input stream from UART, 512bytes should be copied by DMA(block size set as 512), left 2bytes should be copied by serial interrupt handler.

>Again, what changed recently to cause this to start happening?  Why is this only showing up now?  What is unique about your system that causes this and prevents it from happening on any other system?
I think it is a corner case and exist in previous kernel version, we just reproduced it in pressure test.
Our system running multi cores and enabled RT feature, DMA interrupt thread and serial interrupt thread are running on different cores in parallel.
If there is no sync lock, will easy to reproduce.

BRs
Weijun
-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月9日 18:07
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org; First Light <xiaoguang.chen@dji.com>
主题: Re: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



On Thu, Dec 09, 2021 at 09:08:51AM +0000, wigin zeng wrote:
> > What issue exactly?
> The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.

What do you mean exactly by "package size"?  Isn't it up to the DMA transfer to do the whole copy?

> > But what data is being accessed at the same time to cause a crash?
> > How is data being added into the buffer at the same time in two different places into the same queue?  What userspace programs are causing this?
> Both places will modify tty_port *port->buf.tail (kmalloc operation 
> and write the data/flag into this address)

What is the "second" place here?  Data should only be coming in from the DMA transfer copy, right?

> >So all tty buffer accesses need to be protected by your new lock?
> New lock only protected the tty_buffer alloc and write operation in serial-tty case.

You need to document that really well as it does not seem that all accesses are now protected by this lock.  Only one odd access is, and I still do not understand how this is happening at the same time now.

Again, what changed recently to cause this to start happening?  Why is this only showing up now?  What is unique about your system that causes this and prevents it from happening on any other system?

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

[-- Attachment #2: serial_dma_race_condition.jpg --]
[-- Type: image/jpeg, Size: 44410 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20  5:27           ` 答复: " wigin zeng
@ 2021-12-20  8:54             ` Greg KH
  2021-12-20  9:44               ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-20  8:54 UTC (permalink / raw)
  To: wigin zeng; +Cc: jirislaby, linux-serial, linux-kernel, First Light

On Mon, Dec 20, 2021 at 05:27:24AM +0000, wigin zeng wrote:
> Sorry for late response.
> 
> > >> What issue exactly?
> > The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.
> >What do you mean exactly by "package size"?  Isn't it up to the DMA transfer to do the whole copy?
>  
> The attachment is an example for the race condition issue. E.g: 514bytes input stream from UART, 512bytes should be copied by DMA(block size set as 512), left 2bytes should be copied by serial interrupt handler.

That makes no sense, as what orders the data coming in?  The 2 bytes
could be added to the tty buffer before the 512 bytes, or the other way
around.

What hardware are you using that is mixing dma and irq data like this?
That feels very wrong.

> >Again, what changed recently to cause this to start happening?  Why is this only showing up now?  What is unique about your system that causes this and prevents it from happening on any other system?
> I think it is a corner case and exist in previous kernel version, we just reproduced it in pressure test.
> Our system running multi cores and enabled RT feature, DMA interrupt thread and serial interrupt thread are running on different cores in parallel.

If they are running on different cores, then you will have data
corruption issues no matter if you have a lock or not, so this is not
the correct solution for this hardware configuration problem.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20  8:54             ` Greg KH
@ 2021-12-20  9:44               ` wigin zeng
  2021-12-20  9:59                 ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: wigin zeng @ 2021-12-20  9:44 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel, First Light

>That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.

>What hardware are you using that is mixing dma and irq data like this?
>That feels very wrong.

It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.

>If they are running on different cores, then you will have data corruption issues no matter if you have a lock or not, so this is not the correct solution for this hardware configuration problem.

The purpose of adding lock is to ensure that the kernel will not panic in this extreme case, If you want to ensure the integrity of the serial port data, you need to add more flow control logic

BRs
Weijun
-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月20日 16:55
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org; First Light <xiaoguang.chen@dji.com>
主题: Re: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



On Mon, Dec 20, 2021 at 05:27:24AM +0000, wigin zeng wrote:
> Sorry for late response.
>
> > >> What issue exactly?
> > The interval of UART input packages is very small(1ms~ 10ms), and some package size larger than configured DMA transfer size.
> >What do you mean exactly by "package size"?  Isn't it up to the DMA transfer to do the whole copy?
>
> The attachment is an example for the race condition issue. E.g: 514bytes input stream from UART, 512bytes should be copied by DMA(block size set as 512), left 2bytes should be copied by serial interrupt handler.

That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.

What hardware are you using that is mixing dma and irq data like this?
That feels very wrong.

> >Again, what changed recently to cause this to start happening?  Why is this only showing up now?  What is unique about your system that causes this and prevents it from happening on any other system?
> I think it is a corner case and exist in previous kernel version, we just reproduced it in pressure test.
> Our system running multi cores and enabled RT feature, DMA interrupt thread and serial interrupt thread are running on different cores in parallel.

If they are running on different cores, then you will have data corruption issues no matter if you have a lock or not, so this is not the correct solution for this hardware configuration problem.

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20  9:44               ` 答复: " wigin zeng
@ 2021-12-20  9:59                 ` Greg KH
  2021-12-20 10:25                   ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-20  9:59 UTC (permalink / raw)
  To: wigin zeng; +Cc: jirislaby, linux-serial, linux-kernel, First Light

On Mon, Dec 20, 2021 at 09:44:04AM +0000, wigin zeng wrote:
> >That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.
> 
> >What hardware are you using that is mixing dma and irq data like this?
> >That feels very wrong.
> 
> It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
> However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.

You did not answer my question about hardware type :(

And again, how is this happening?  If you use DMA, all data should be
coming through DMA and not the irq.  Otherwise crazy stuff like this
will happen in any type of driver, your hardware can not mix this type
of stuff up.

> >If they are running on different cores, then you will have data corruption issues no matter if you have a lock or not, so this is not the correct solution for this hardware configuration problem.
> 
> The purpose of adding lock is to ensure that the kernel will not panic in this extreme case, If you want to ensure the integrity of the serial port data, you need to add more flow control logic

How can flow control handle this at all?  Flow control is at the serial
data stream level.  This is confusing the PCI data stream order.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20  9:59                 ` Greg KH
@ 2021-12-20 10:25                   ` wigin zeng
  2021-12-20 10:40                     ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: wigin zeng @ 2021-12-20 10:25 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel, First Light

On Mon, Dec 20, 2021 at 09:44:04AM +0000, wigin zeng wrote:
> > >That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.
>
> > >What hardware are you using that is mixing dma and irq data like this?
> > >That feels very wrong.
>
> >It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
> >However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.

>You did not answer my question about hardware type :(

>And again, how is this happening?  If you use DMA, all data should be coming through DMA and not the irq.  Otherwise crazy stuff like this will happen in any type of driver, your hardware can not mix this type of stuff up.

On our platform, UART connected to a MCU which will send data of variable length from time to time. There is no definition of a maximum transmission length.
We configured DMA block size is 4096bytes, however, there are more than 4100 bytes input, DMA just handled 4096bytes and left bytes in FIFO cannot trigger next DMA 
Transfer done interrupt(left bytes number < DMA block size ), so these data should be processed by UART IRQ.

In other word, if the external use UART "vulnerability" to attack the system, we need to ensure that the system not crash at least, right?

>How can flow control handle this at all?  Flow control is at the serial data stream level.  This is confusing the PCI data stream order.

I just think more logic is needed to control the order of data processing by DMA and UART IRQ to keep the integrity of serial data. 
But the specific design, I haven't considered yet, the first goal is the keep the system alive.

BRs
Weijun
 
-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月20日 17:59
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org; First Light <xiaoguang.chen@dji.com>
主题: Re: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



On Mon, Dec 20, 2021 at 09:44:04AM +0000, wigin zeng wrote:
> >That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.
>
> >What hardware are you using that is mixing dma and irq data like this?
> >That feels very wrong.
>
> It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
> However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.

You did not answer my question about hardware type :(

And again, how is this happening?  If you use DMA, all data should be coming through DMA and not the irq.  Otherwise crazy stuff like this will happen in any type of driver, your hardware can not mix this type of stuff up.

> >If they are running on different cores, then you will have data corruption issues no matter if you have a lock or not, so this is not the correct solution for this hardware configuration problem.
>
> The purpose of adding lock is to ensure that the kernel will not panic 
> in this extreme case, If you want to ensure the integrity of the 
> serial port data, you need to add more flow control logic

How can flow control handle this at all?  Flow control is at the serial data stream level.  This is confusing the PCI data stream order.

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: 答复: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20 10:25                   ` 答复: " wigin zeng
@ 2021-12-20 10:40                     ` Greg KH
  2021-12-30  4:41                       ` 答复: " wigin zeng
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2021-12-20 10:40 UTC (permalink / raw)
  To: wigin zeng; +Cc: jirislaby, linux-serial, linux-kernel, First Light

On Mon, Dec 20, 2021 at 10:25:51AM +0000, wigin zeng wrote:
> On Mon, Dec 20, 2021 at 09:44:04AM +0000, wigin zeng wrote:
> > > >That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.
> >
> > > >What hardware are you using that is mixing dma and irq data like this?
> > > >That feels very wrong.
> >
> > >It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
> > >However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.
> 
> >You did not answer my question about hardware type :(
> 
> >And again, how is this happening?  If you use DMA, all data should be coming through DMA and not the irq.  Otherwise crazy stuff like this will happen in any type of driver, your hardware can not mix this type of stuff up.
> 
> On our platform, UART connected to a MCU which will send data of variable length from time to time. There is no definition of a maximum transmission length.
> We configured DMA block size is 4096bytes, however, there are more than 4100 bytes input, DMA just handled 4096bytes and left bytes in FIFO cannot trigger next DMA 
> Transfer done interrupt(left bytes number < DMA block size ), so these data should be processed by UART IRQ.

That is a broken hardware design and will not work with any operating
system.

> In other word, if the external use UART "vulnerability" to attack the system, we need to ensure that the system not crash at least, right?

So you are saying that Linux now treat all hardware that has DMA
functionality as a potential threat?  That is not a model that Linux, or
any other operating system, has ever had to support before, please do
not make up new rules here and expect Linux to automatically support
them without a lot of redesign and work.

If you wish to protect Linux from this type of untrusted hardware,
please do the work to do so.  This patch is not that work.

> >How can flow control handle this at all?  Flow control is at the serial data stream level.  This is confusing the PCI data stream order.
> 
> I just think more logic is needed to control the order of data processing by DMA and UART IRQ to keep the integrity of serial data. 
> But the specific design, I haven't considered yet, the first goal is the keep the system alive.

Again, this is a broken hardware design, please fix that first.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* 答复: 答复: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx
  2021-12-20 10:40                     ` Greg KH
@ 2021-12-30  4:41                       ` wigin zeng
  0 siblings, 0 replies; 12+ messages in thread
From: wigin zeng @ 2021-12-30  4:41 UTC (permalink / raw)
  To: Greg KH; +Cc: jirislaby, linux-serial, linux-kernel, First Light

>> On our platform, UART connected to a MCU which will send data of variable length from time to time. There is no definition of a maximum transmission length.
> >We configured DMA block size is 4096bytes, however, there are more 
> >than 4100 bytes input, DMA just handled 4096bytes and left bytes in FIFO cannot trigger next DMA Transfer done interrupt(left bytes number < DMA block size ), so these data should be processed by UART IRQ.

>That is a broken hardware design and will not work with any operating system.

Do you mean the data size of UART input must be smaller than DMA configured RX block size? If not, there is risk to cause panic with current driver.
We cannot limit the length of data sent by an external device at a time. No matter how much data input externally, we should ensure system not crash, this patch achieves this goal.

BRs
Weijun
-----邮件原件-----
发件人: Greg KH [mailto:gregkh@linuxfoundation.org] 
发送时间: 2021年12月20日 18:41
收件人: wigin zeng <wigin.zeng@dji.com>
抄送: jirislaby@kernel.org; linux-serial@vger.kernel.org; linux-kernel@vger.kernel.org; First Light <xiaoguang.chen@dji.com>
主题: Re: 答复: 答复: 答复: 答复: 答复: [PATCH] serial: 8250: add lock for dma rx

【EXTERNAL EMAIL】 DO NOT CLICK any links or attachments unless you can make sure both the sender and the content are trustworthy.


【外部邮件提醒】以下邮件来源于公司外部,请勿点击链接或附件,除非您确认邮件发件人和内容可信。



On Mon, Dec 20, 2021 at 10:25:51AM +0000, wigin zeng wrote:
> On Mon, Dec 20, 2021 at 09:44:04AM +0000, wigin zeng wrote:
> > > >That makes no sense, as what orders the data coming in?  The 2 bytes could be added to the tty buffer before the 512 bytes, or the other way around.
> >
> > > >What hardware are you using that is mixing dma and irq data like this?
> > > >That feels very wrong.
> >
> > >It is not normal case, normally, the input size should smaller than DMA block size and DMA complete the whole copy.
> > >However, there are some abnormal situations. The external input is unexpectedly larger than the data length of the DMA configuration. This situation in my example will appear, and it may cause the kernel to panic.
>
> >You did not answer my question about hardware type :(
>
> >And again, how is this happening?  If you use DMA, all data should be coming through DMA and not the irq.  Otherwise crazy stuff like this will happen in any type of driver, your hardware can not mix this type of stuff up.
>
> On our platform, UART connected to a MCU which will send data of variable length from time to time. There is no definition of a maximum transmission length.
> We configured DMA block size is 4096bytes, however, there are more 
> than 4100 bytes input, DMA just handled 4096bytes and left bytes in FIFO cannot trigger next DMA Transfer done interrupt(left bytes number < DMA block size ), so these data should be processed by UART IRQ.

That is a broken hardware design and will not work with any operating system.

> In other word, if the external use UART "vulnerability" to attack the system, we need to ensure that the system not crash at least, right?

So you are saying that Linux now treat all hardware that has DMA functionality as a potential threat?  That is not a model that Linux, or any other operating system, has ever had to support before, please do not make up new rules here and expect Linux to automatically support them without a lot of redesign and work.

If you wish to protect Linux from this type of untrusted hardware, please do the work to do so.  This patch is not that work.

> >How can flow control handle this at all?  Flow control is at the serial data stream level.  This is confusing the PCI data stream order.
>
> I just think more logic is needed to control the order of data processing by DMA and UART IRQ to keep the integrity of serial data.
> But the specific design, I haven't considered yet, the first goal is the keep the system alive.

Again, this is a broken hardware design, please fix that first.

thanks,

greg k-h
This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient. Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited. If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.

此电子邮件及附件所包含内容具有机密性,且仅限于接收人使用。未经允许,禁止第三人阅读、复制或传播该电子邮件中的任何信息。如果您不属于以上电子邮件的目标接收者,请您立即通知发送人并删除原电子邮件及其相关的附件。

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-12-30  4:49 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20211209073339.21694-1-wigin.zeng@dji.com>
2021-12-09  7:38 ` [PATCH] serial: 8250: add lock for dma rx Greg KH
2021-12-09  8:15   ` 答复: " wigin zeng
2021-12-09  8:42     ` Greg KH
2021-12-09  9:08       ` 答复: " wigin zeng
2021-12-09 10:07         ` Greg KH
2021-12-20  5:27           ` 答复: " wigin zeng
2021-12-20  8:54             ` Greg KH
2021-12-20  9:44               ` 答复: " wigin zeng
2021-12-20  9:59                 ` Greg KH
2021-12-20 10:25                   ` 答复: " wigin zeng
2021-12-20 10:40                     ` Greg KH
2021-12-30  4:41                       ` 答复: " wigin zeng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.