All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] SLIP SLIP-Improve robustness to crashing
@ 2013-07-02 15:31 Dean Jenkins
  2013-07-02 15:31 ` [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes Dean Jenkins
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

Using SLIP bound to RFCOMM or PTY/TTY has identified some weaknesses to crashing
under abnormal conditions.

Here is a proposed patchset baselined and built on Linux 3.9.

Note the patches have not been tested on x86 Linux 3.9. However similar patches
have been used on ARM Linux 2.6.34 to avoid kernel crashes in a commercial
project. I believe the same weaknesses still exist in Linux 3.9.

If some or all of the patches look to be useful to the community then I may
attempt to test on x86 but this is not straight forward for me.

I welcome any feedback and whether the fixes are a suitable solution.

Who is the maintainer of SLIP in the kernel ?

The patchset consists of:

0001-Bluetooth-Add-RFCOMM-TTY-write-return-error-codes.patch
0002-SLIP-Handle-error-codes-from-the-TTY-layer.patch
0003-SLIP-Prevent-recursion-stack-overflow-and-scheduler-.patch
0004-SLIP-Add-error-message-for-xleft-non-zero.patch
0005-SLIP-Fix-transmission-segmentation-mechanism.patches

Some background:

0001-Bluetooth-Add-RFCOMM-TTY-write-return-error-codes.patch
This patch is a Bluetooth change to add some error return codes to RFCOMM to
avoid NULL pointer dereference crashes. Note RFCOMM can already generate an
error code that will cause SLIP to malfunction.

0002-SLIP-Handle-error-codes-from-the-TTY-layer.patches
This patch allows SLIP to handle error codes from RFCOMM or other bound TTY layers.

0003-SLIP-Prevent-recursion-stack-overflow-and-scheduler-.patches
This patch prevents SLIP from causing a recursive loop that overflows the stack
and catastrophically crashes the kernel. The scenario is SLIP bound to PTY/TTY.
The underlying trigger is a probably a failure to allocate a TTY buffer in
tty_buffer_alloc() but this is unproven. The crash is sporadic in an ARM
embedded environment where resources are limited.

0004-SLIP-Add-error-message-for-xleft-non-zero.patch
This is an error message patch to identify when a SLIP frame has not been fully
transmitted meaning the frame was truncated.

0005-SLIP-Fix-transmission-segmentation-mechanism.patches
This patch allows multiple attempts to transmit segments of the SLIP frame.
Currently only 1 attempt at writing the whole SLIP frame to PTY/TTY occurs.
This could truncate transmitted SLIP frames. In addition the modification
relies on the TTY write wake-up event to complete the transmission of the
SLIP frame rather than the sl_encaps() call to pty_write(). Probably,
pty_write() should not call tty_wakeup() but safer to modify SLIP rather
than the PTY/TTY layer.

Thanks,
Dean Jenkins
Mentor Graphics
-- 
1.8.1.5


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes
  2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
@ 2013-07-02 15:31 ` Dean Jenkins
  2013-07-24 23:12   ` Peter Hurley
  2013-07-02 15:31 ` [PATCH 2/5] SLIP: Handle error codes from the TTY layer Dean Jenkins
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

It appears that rfcomm_tty_write() does not check that the
passed in TTY device_data is not NULL and also does not check
that the RFCOMM DLC serial data link pointer is not NULL.

A kernel crash was observed whilst SLIP was bound to /dev/rfcomm0
but the /dev/rfcomm0 had subsequently disconnected. Unfortunately,
SLIP attempted to write to the now non-existant RFCOMM TTY device
which caused a NULL pointer dereference because the device_data
no longer existed.

Therefore, add NULL pointer checks for the dev and dlc pointers
and output kernel error debug to show that NULL had been detected.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 net/bluetooth/rfcomm/tty.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/rfcomm/tty.c b/net/bluetooth/rfcomm/tty.c
index b6e44ad..56d28d1 100644
--- a/net/bluetooth/rfcomm/tty.c
+++ b/net/bluetooth/rfcomm/tty.c
@@ -761,12 +761,24 @@ static void rfcomm_tty_close(struct tty_struct *tty, struct file *filp)
 static int rfcomm_tty_write(struct tty_struct *tty, const unsigned char *buf, int count)
 {
 	struct rfcomm_dev *dev = (struct rfcomm_dev *) tty->driver_data;
-	struct rfcomm_dlc *dlc = dev->dlc;
+	struct rfcomm_dlc *dlc;
 	struct sk_buff *skb;
 	int err = 0, sent = 0, size;
 
 	BT_DBG("tty %p count %d", tty, count);
 
+	if (!dev) {
+		BT_ERR("RFCOMM TTY device data structure does not exist");
+		return -ENODEV;
+	}
+
+	dlc = dev->dlc;
+
+	if (!dlc) {
+		BT_ERR("RFCOMM serial data link does not exist");
+		return -ENOLINK;
+	}
+
 	while (count) {
 		size = min_t(uint, count, dlc->mtu);
 
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] SLIP: Handle error codes from the TTY layer
  2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
  2013-07-02 15:31 ` [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes Dean Jenkins
@ 2013-07-02 15:31 ` Dean Jenkins
  2013-07-24 22:18   ` Greg Kroah-Hartman
  2013-07-02 15:31 ` [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash Dean Jenkins
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

It appears that SLIP does not handle error codes from the TTY layer.
This will result in a malfunction because the remaining length of
data will be corrupted by the negative error code values from the TTY
layer.

Therefore, add error code checks in sl_encaps() and sl_encaps_wakeup()
to prevent the corruption of the sent data length.

Note that SLIP is connectionless so on TTY error indicate that all data
was sent. It seems SLIP does not return error codes to the network
layer.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 drivers/net/slip/slip.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index a34d6bf..bed819f 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -374,7 +374,7 @@ static void sl_bump(struct slip *sl)
 static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 {
 	unsigned char *p;
-	int actual, count;
+	int actual, count, err;
 
 	if (len > sl->mtu) {		/* Sigh, shouldn't occur BUT ... */
 		printk(KERN_WARNING "%s: truncating oversized transmit packet!\n", sl->dev->name);
@@ -404,7 +404,16 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 	 *       14 Oct 1994  Dmitry Gorodchanin.
 	 */
 	set_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
-	actual = sl->tty->ops->write(sl->tty, sl->xbuff, count);
+	err = sl->tty->ops->write(sl->tty, sl->xbuff, count);
+
+	if (err < 0) {
+		/* error case, say all was sent as connectionless */
+		actual = count;
+	} else {
+		/* good case, err contains the number sent */
+		actual = err;
+	}
+
 #ifdef SL_CHECK_TRANSMIT
 	sl->dev->trans_start = jiffies;
 #endif
@@ -422,7 +431,7 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
  */
 static void slip_write_wakeup(struct tty_struct *tty)
 {
-	int actual;
+	int actual, err;
 	struct slip *sl = tty->disc_data;
 
 	/* First make sure we're connected. */
@@ -438,7 +447,16 @@ static void slip_write_wakeup(struct tty_struct *tty)
 		return;
 	}
 
-	actual = tty->ops->write(tty, sl->xhead, sl->xleft);
+	err = tty->ops->write(tty, sl->xhead, sl->xleft);
+
+	if (err < 0) {
+		/* error case, say all was sent as connectionless */
+		actual = sl->xleft;
+	} else {
+		/* good case, err contains the number sent */
+		actual = err;
+	}
+
 	sl->xleft -= actual;
 	sl->xhead += actual;
 }
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash
  2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
  2013-07-02 15:31 ` [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes Dean Jenkins
  2013-07-02 15:31 ` [PATCH 2/5] SLIP: Handle error codes from the TTY layer Dean Jenkins
@ 2013-07-02 15:31 ` Dean Jenkins
  2013-07-25  1:12   ` Peter Hurley
  2013-07-02 15:31 ` [PATCH 4/5] SLIP: Add error message for xleft non-zero Dean Jenkins
  2013-07-02 15:31 ` [PATCH 5/5] SLIP: Fix transmission segmentation mechanism Dean Jenkins
  4 siblings, 1 reply; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

This is an issue when SLIP is bound to a PTY/TTY. If
TTY_DO_WRITE_WAKEUP is set, pty_write() calls tty_wakeup()
then slip_write_wakeup() can be called. slip_write_wakeup() can
be called by pty_write(). This is a recursion loop.

pty_write() is called in sl_encaps().
slip_write_wakeup() can be called by the TTY wakeup event.

The pty_write() call in sl_encaps() will also call
slip_write_wakeup() but xleft has not been updated and contains
the value from the previous SLIP frame transmission. xleft is zero
unless the previous SLIP frame failed to be fully transmitted in
which case xleft has a positive value. A failed transmission causes
the next SLIP frame pending transmission to cause a crash.

In the failure case when xleft is positive in slip_write_wakeup(),
recursion causes the stack to overflow and task structures located
near the stack are corrupted by the stack overflow. The corrupted
task structures crash the kernel's scheduler and the system
crashes with exception handlers crashing and the emergency reboot
fails.

The recursion loop is:
slip_write_wakeup()-->pty_write()-->tty_wakeup()-->slip_write_wakeup()
etc.

Therefore ensure xleft is zero before writing the SLIP frame to the
PTY/TTY layers. This prevents the xleft value of the previous SLIP
frame from interfering with the slip_write_wakeup() execution when
SLIP is bound to a PTY/TTY.

Note the transmission segmentation mechanism is broken and only a
single call to the write() function pointer will take place per
SLIP frame. This could cause missing or truncated SLIP frames to
be transmitted when the write() function fails to write the complete
frame. In other words the TTY wakeup event does nothing because
the TTY_DO_WRITE_WAKEUP flag has been cleared.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 drivers/net/slip/slip.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index bed819f..f7303e0 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -395,6 +395,11 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 #endif
 		count = slip_esc(p, sl->xbuff, len);
 
+	/* ensure xleft set by the previous SLIP frame is zero for this frame
+	 * otherwise slip_write_wakeup() can cause a recursive loop.
+	 */
+	sl->xleft = 0;
+
 	/* Order of next two lines is *very* important.
 	 * When we are sending a little amount of data,
 	 * the transfer may be completed inside the ops->write()
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] SLIP: Add error message for xleft non-zero
  2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
                   ` (2 preceding siblings ...)
  2013-07-02 15:31 ` [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash Dean Jenkins
@ 2013-07-02 15:31 ` Dean Jenkins
  2013-07-24 22:18   ` Greg Kroah-Hartman
  2013-07-02 15:31 ` [PATCH 5/5] SLIP: Fix transmission segmentation mechanism Dean Jenkins
  4 siblings, 1 reply; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

Add a printk to show when xleft is non-zero in sl_encaps.

The idea is to see whether a previous SLIP frame failed to be
fully transmitted.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 drivers/net/slip/slip.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index f7303e0..e2eff84 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -395,6 +395,10 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 #endif
 		count = slip_esc(p, sl->xbuff, len);
 
+	if (sl->xleft)
+		printk(KERN_ERR "%s: ERROR: xleft is non-zero %d\n",
+		       __func__, sl->xleft);
+
 	/* ensure xleft set by the previous SLIP frame is zero for this frame
 	 * otherwise slip_write_wakeup() can cause a recursive loop.
 	 */
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] SLIP: Fix transmission segmentation mechanism
  2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
                   ` (3 preceding siblings ...)
  2013-07-02 15:31 ` [PATCH 4/5] SLIP: Add error message for xleft non-zero Dean Jenkins
@ 2013-07-02 15:31 ` Dean Jenkins
  2013-07-24 22:18   ` Greg Kroah-Hartman
  2013-07-25  0:13   ` Peter Hurley
  4 siblings, 2 replies; 14+ messages in thread
From: Dean Jenkins @ 2013-07-02 15:31 UTC (permalink / raw)
  To: Andre Naujoks, linux-kernel; +Cc: Jiri Slaby, Greg Kroah-Hartman

Binding SLIP with a PTY/TTY can cause truncated SLIP frames to be
transmitted when the first write to the TTY fails to consume all
the characters of the SLIP frame.

Asynchronous to the write function is a wakeup event from the TTY
that indicates the TTY can accept more data. The wakeup event
calls tty_wakeup() which calls slip_write_wakeup() when
TTY_DO_WRITE_WAKEUP is set.

To complete the transmission of a SLIP frame to the TTY,
slip_write_wakeup() must run at least once. Unfortunately, pty_write()
also calls tty_wakeup() and when TTY_DO_WRITE_WAKEUP is set,
slip_write_wakeup() is called. xleft is always zero and
causes slip_write_wakeup() to complete the transmission and
clears the TTY_DO_WRITE_WAKEUP flag. This can cause a truncated
SLIP frame because any remaining characters will not get sent.

The wakeup event is unable to process the remaining characters
because the TTY_DO_WRITE_WAKEUP flag has been cleared.

The code modification fixes the transmission segmentation
mechanism by preventing pty_write() from calling
slip_write_wakeup() by clearing TTY_DO_WRITE_WAKEUP before calling
pty_write(). After pty_write() returns, TTY_DO_WRITE_WAKEUP is set
to allow the TTY wakeup event to call slip_write_wakeup() to
attempt to complete the transmission of the SLIP frame.

This may not be foolproof because a timeout is needed to
break out of the cycle of transmission attempts. Note error
codes from the TTY layer will break out of the cycle of
transmission attempts.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 drivers/net/slip/slip.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index e2eff84..0ded23d 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -404,15 +404,13 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 	 */
 	sl->xleft = 0;
 
-	/* Order of next two lines is *very* important.
-	 * When we are sending a little amount of data,
-	 * the transfer may be completed inside the ops->write()
-	 * routine, because it's running with interrupts enabled.
-	 * In this case we *never* got WRITE_WAKEUP event,
-	 * if we did not request it before write operation.
-	 *       14 Oct 1994  Dmitry Gorodchanin.
+	/* ensure slip_write_wakeup() does not run due to write()
+	 * or write_wakeup event and this prevents slip_write_wakeup()
+	 * responding to an out of date xleft value.
 	 */
-	set_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
+	clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
+
+	/* attempt to write the SLIP frame to the TTY buffer */
 	err = sl->tty->ops->write(sl->tty, sl->xbuff, count);
 
 	if (err < 0) {
@@ -432,6 +430,11 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 	/* VSV */
 	clear_bit(SLF_OUTWAIT, &sl->flags);	/* reset outfill flag */
 #endif
+	/* xleft will be zero when all characters have been written.
+	 * if xleft is positive then additional writes are needed.
+	 * write_wakeup event is needed to complete the transmission.
+	 */
+	set_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
 }
 
 /*
@@ -447,15 +450,18 @@ static void slip_write_wakeup(struct tty_struct *tty)
 	if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
 		return;
 
+	clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+
 	if (sl->xleft <= 0)  {
+		/* Whole SLIP frame has been written. */
 		/* Now serial buffer is almost free & we can start
 		 * transmission of another packet */
 		sl->dev->stats.tx_packets++;
-		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
 		sl_unlock(sl);
 		return;
 	}
 
+	/* attempt to write the remaining SLIP frame characters */
 	err = tty->ops->write(tty, sl->xhead, sl->xleft);
 
 	if (err < 0) {
@@ -468,6 +474,11 @@ static void slip_write_wakeup(struct tty_struct *tty)
 
 	sl->xleft -= actual;
 	sl->xhead += actual;
+
+	/* allow the next tty wakeup event to attempt to complete
+	 * the transmission
+	 */
+	set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
 }
 
 static void sl_tx_timeout(struct net_device *dev)
-- 
1.8.1.5


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/5] SLIP: Handle error codes from the TTY layer
  2013-07-02 15:31 ` [PATCH 2/5] SLIP: Handle error codes from the TTY layer Dean Jenkins
@ 2013-07-24 22:18   ` Greg Kroah-Hartman
  0 siblings, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2013-07-24 22:18 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby

On Tue, Jul 02, 2013 at 04:31:31PM +0100, Dean Jenkins wrote:
> It appears that SLIP does not handle error codes from the TTY layer.
> This will result in a malfunction because the remaining length of
> data will be corrupted by the negative error code values from the TTY
> layer.
> 
> Therefore, add error code checks in sl_encaps() and sl_encaps_wakeup()
> to prevent the corruption of the sent data length.
> 
> Note that SLIP is connectionless so on TTY error indicate that all data
> was sent. It seems SLIP does not return error codes to the network
> layer.
> 
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] SLIP: Fix transmission segmentation mechanism
  2013-07-02 15:31 ` [PATCH 5/5] SLIP: Fix transmission segmentation mechanism Dean Jenkins
@ 2013-07-24 22:18   ` Greg Kroah-Hartman
  2013-07-25  0:13   ` Peter Hurley
  1 sibling, 0 replies; 14+ messages in thread
From: Greg Kroah-Hartman @ 2013-07-24 22:18 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby

On Tue, Jul 02, 2013 at 04:31:34PM +0100, Dean Jenkins wrote:
> Binding SLIP with a PTY/TTY can cause truncated SLIP frames to be
> transmitted when the first write to the TTY fails to consume all
> the characters of the SLIP frame.
> 
> Asynchronous to the write function is a wakeup event from the TTY
> that indicates the TTY can accept more data. The wakeup event
> calls tty_wakeup() which calls slip_write_wakeup() when
> TTY_DO_WRITE_WAKEUP is set.
> 
> To complete the transmission of a SLIP frame to the TTY,
> slip_write_wakeup() must run at least once. Unfortunately, pty_write()
> also calls tty_wakeup() and when TTY_DO_WRITE_WAKEUP is set,
> slip_write_wakeup() is called. xleft is always zero and
> causes slip_write_wakeup() to complete the transmission and
> clears the TTY_DO_WRITE_WAKEUP flag. This can cause a truncated
> SLIP frame because any remaining characters will not get sent.
> 
> The wakeup event is unable to process the remaining characters
> because the TTY_DO_WRITE_WAKEUP flag has been cleared.
> 
> The code modification fixes the transmission segmentation
> mechanism by preventing pty_write() from calling
> slip_write_wakeup() by clearing TTY_DO_WRITE_WAKEUP before calling
> pty_write(). After pty_write() returns, TTY_DO_WRITE_WAKEUP is set
> to allow the TTY wakeup event to call slip_write_wakeup() to
> attempt to complete the transmission of the SLIP frame.
> 
> This may not be foolproof because a timeout is needed to
> break out of the cycle of transmission attempts. Note error
> codes from the TTY layer will break out of the cycle of
> transmission attempts.
> 
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] SLIP: Add error message for xleft non-zero
  2013-07-02 15:31 ` [PATCH 4/5] SLIP: Add error message for xleft non-zero Dean Jenkins
@ 2013-07-24 22:18   ` Greg Kroah-Hartman
  2013-07-24 22:41     ` Paul Bolle
  0 siblings, 1 reply; 14+ messages in thread
From: Greg Kroah-Hartman @ 2013-07-24 22:18 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby

On Tue, Jul 02, 2013 at 04:31:33PM +0100, Dean Jenkins wrote:
> Add a printk to show when xleft is non-zero in sl_encaps.
> 
> The idea is to see whether a previous SLIP frame failed to be
> fully transmitted.
> 
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>  drivers/net/slip/slip.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> index f7303e0..e2eff84 100644
> --- a/drivers/net/slip/slip.c
> +++ b/drivers/net/slip/slip.c
> @@ -395,6 +395,10 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
>  #endif
>  		count = slip_esc(p, sl->xbuff, len);
>  
> +	if (sl->xleft)
> +		printk(KERN_ERR "%s: ERROR: xleft is non-zero %d\n",
> +		       __func__, sl->xleft);

dev_err() perhaps?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] SLIP: Add error message for xleft non-zero
  2013-07-24 22:18   ` Greg Kroah-Hartman
@ 2013-07-24 22:41     ` Paul Bolle
  0 siblings, 0 replies; 14+ messages in thread
From: Paul Bolle @ 2013-07-24 22:41 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Dean Jenkins, Andre Naujoks, linux-kernel, Jiri Slaby

On Wed, 2013-07-24 at 15:18 -0700, Greg Kroah-Hartman wrote:
> On Tue, Jul 02, 2013 at 04:31:33PM +0100, Dean Jenkins wrote:
> > Add a printk to show when xleft is non-zero in sl_encaps.
> > 
> > The idea is to see whether a previous SLIP frame failed to be
> > fully transmitted.
> > 
> > Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> > ---
> >  drivers/net/slip/slip.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> > index f7303e0..e2eff84 100644
> > --- a/drivers/net/slip/slip.c
> > +++ b/drivers/net/slip/slip.c
> > @@ -395,6 +395,10 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
> >  #endif
> >  		count = slip_esc(p, sl->xbuff, len);
> >  
> > +	if (sl->xleft)
> > +		printk(KERN_ERR "%s: ERROR: xleft is non-zero %d\n",
> > +		       __func__, sl->xleft);
> 
> dev_err() perhaps?

After looking at the commit explanation and the patch itself I wonder
why this should be printed at error level. Especially since patch 3/5
will set sl->xleft to zero immediately after.

So, dev_dbg() perhaps?

And can't this be merged into 3/5?


Paul Bolle  


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes
  2013-07-02 15:31 ` [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes Dean Jenkins
@ 2013-07-24 23:12   ` Peter Hurley
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Hurley @ 2013-07-24 23:12 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby, Greg Kroah-Hartman

On 07/02/2013 11:31 AM, Dean Jenkins wrote:
> It appears that rfcomm_tty_write() does not check that the
> passed in TTY device_data is not NULL and also does not check
> that the RFCOMM DLC serial data link pointer is not NULL.
>
> A kernel crash was observed whilst SLIP was bound to /dev/rfcomm0
> but the /dev/rfcomm0 had subsequently disconnected. Unfortunately,
> SLIP attempted to write to the now non-existant RFCOMM TTY device
> which caused a NULL pointer dereference because the device_data
> no longer existed.
>
> Therefore, add NULL pointer checks for the dev and dlc pointers
> and output kernel error debug to show that NULL had been detected.

Dean,

Sorry, I didn't see these until just now.

> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>   net/bluetooth/rfcomm/tty.c | 14 +++++++++++++-
>   1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/net/bluetooth/rfcomm/tty.c b/net/bluetooth/rfcomm/tty.c
> index b6e44ad..56d28d1 100644
> --- a/net/bluetooth/rfcomm/tty.c
> +++ b/net/bluetooth/rfcomm/tty.c
> @@ -761,12 +761,24 @@ static void rfcomm_tty_close(struct tty_struct *tty, struct file *filp)
>   static int rfcomm_tty_write(struct tty_struct *tty, const unsigned char *buf, int count)
>   {
>   	struct rfcomm_dev *dev = (struct rfcomm_dev *) tty->driver_data;
> -	struct rfcomm_dlc *dlc = dev->dlc;
> +	struct rfcomm_dlc *dlc;
>   	struct sk_buff *skb;
>   	int err = 0, sent = 0, size;
>
>   	BT_DBG("tty %p count %d", tty, count);
>
> +	if (!dev) {
> +		BT_ERR("RFCOMM TTY device data structure does not exist");
> +		return -ENODEV;
> +	}
> +
> +	dlc = dev->dlc;
> +
> +	if (!dlc) {
> +		BT_ERR("RFCOMM serial data link does not exist");
> +		return -ENOLINK;
> +	}
> +

A number of bugs in rfcomm contributed to the crash you observed.
Several other reporters have also noted crashes stemming from
device disconnect. The rfcomm device must not disappear while a tty
is in-use.

Fixes for these are in-progress in a patch series by Gianluca Anzolin
on linux-bluetooth ML.

Also, for the future, kernel bluetooth patches go to:
   Gustavo Padovan <gustavo@padovan.org>
with cc's to:
   Johan Hedberg <johan.hedberg@gmail.com>
   Marcel Holtmann <marcel@holtmann.org>
   linux-bluetooth@vger.kernel.org

Regards,
Peter Hurley


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 5/5] SLIP: Fix transmission segmentation mechanism
  2013-07-02 15:31 ` [PATCH 5/5] SLIP: Fix transmission segmentation mechanism Dean Jenkins
  2013-07-24 22:18   ` Greg Kroah-Hartman
@ 2013-07-25  0:13   ` Peter Hurley
  1 sibling, 0 replies; 14+ messages in thread
From: Peter Hurley @ 2013-07-25  0:13 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby, Greg Kroah-Hartman

On 07/02/2013 11:31 AM, Dean Jenkins wrote:
> Binding SLIP with a PTY/TTY can cause truncated SLIP frames to be
> transmitted when the first write to the TTY fails to consume all
> the characters of the SLIP frame.
>
> Asynchronous to the write function is a wakeup event from the TTY
> that indicates the TTY can accept more data. The wakeup event
> calls tty_wakeup() which calls slip_write_wakeup() when
> TTY_DO_WRITE_WAKEUP is set.
>
> To complete the transmission of a SLIP frame to the TTY,
> slip_write_wakeup() must run at least once. Unfortunately, pty_write()
> also calls tty_wakeup() and when TTY_DO_WRITE_WAKEUP is set,
> slip_write_wakeup() is called. xleft is always zero and
> causes slip_write_wakeup() to complete the transmission and
> clears the TTY_DO_WRITE_WAKEUP flag. This can cause a truncated
> SLIP frame because any remaining characters will not get sent.
>
> The wakeup event is unable to process the remaining characters
> because the TTY_DO_WRITE_WAKEUP flag has been cleared.
>
> The code modification fixes the transmission segmentation
> mechanism by preventing pty_write() from calling
> slip_write_wakeup() by clearing TTY_DO_WRITE_WAKEUP before calling
> pty_write(). After pty_write() returns, TTY_DO_WRITE_WAKEUP is set
> to allow the TTY wakeup event to call slip_write_wakeup() to
> attempt to complete the transmission of the SLIP frame.
>
> This may not be foolproof because a timeout is needed to
> break out of the cycle of transmission attempts. Note error
> codes from the TTY layer will break out of the cycle of
> transmission attempts.
>
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>   drivers/net/slip/slip.c | 29 ++++++++++++++++++++---------
>   1 file changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> index e2eff84..0ded23d 100644
> --- a/drivers/net/slip/slip.c
> +++ b/drivers/net/slip/slip.c
> @@ -404,15 +404,13 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
>   	 */
>   	sl->xleft = 0;
>
> -	/* Order of next two lines is *very* important.
> -	 * When we are sending a little amount of data,
> -	 * the transfer may be completed inside the ops->write()
> -	 * routine, because it's running with interrupts enabled.
> -	 * In this case we *never* got WRITE_WAKEUP event,
> -	 * if we did not request it before write operation.
> -	 *       14 Oct 1994  Dmitry Gorodchanin.
> +	/* ensure slip_write_wakeup() does not run due to write()
> +	 * or write_wakeup event and this prevents slip_write_wakeup()
> +	 * responding to an out of date xleft value.
>   	 */
> -	set_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
> +	clear_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);
> +
> +	/* attempt to write the SLIP frame to the TTY buffer */
>   	err = sl->tty->ops->write(sl->tty, sl->xbuff, count);
>
>   	if (err < 0) {
> @@ -432,6 +430,11 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
>   	/* VSV */
>   	clear_bit(SLF_OUTWAIT, &sl->flags);	/* reset outfill flag */
>   #endif
> +	/* xleft will be zero when all characters have been written.
> +	 * if xleft is positive then additional writes are needed.
> +	 * write_wakeup event is needed to complete the transmission.
> +	 */
> +	set_bit(TTY_DO_WRITE_WAKEUP, &sl->tty->flags);

What happens if the tty_wakeup() has already happened on another
CPU before TTY_DO_WRITE_WAKEUP was set?

>   }
>
>   /*
> @@ -447,15 +450,18 @@ static void slip_write_wakeup(struct tty_struct *tty)
>   	if (!sl || sl->magic != SLIP_MAGIC || !netif_running(sl->dev))
>   		return;
>
> +	clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
> +
>   	if (sl->xleft <= 0)  {
> +		/* Whole SLIP frame has been written. */
>   		/* Now serial buffer is almost free & we can start
>   		 * transmission of another packet */
>   		sl->dev->stats.tx_packets++;
> -		clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
>   		sl_unlock(sl);
>   		return;
>   	}
>
> +	/* attempt to write the remaining SLIP frame characters */
>   	err = tty->ops->write(tty, sl->xhead, sl->xleft);
>
>   	if (err < 0) {
> @@ -468,6 +474,11 @@ static void slip_write_wakeup(struct tty_struct *tty)
>
>   	sl->xleft -= actual;
>   	sl->xhead += actual;
> +
> +	/* allow the next tty wakeup event to attempt to complete
> +	 * the transmission
> +	 */
> +	set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);

Same here.

>   }
>
>   static void sl_tx_timeout(struct net_device *dev)
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash
  2013-07-02 15:31 ` [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash Dean Jenkins
@ 2013-07-25  1:12   ` Peter Hurley
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Hurley @ 2013-07-25  1:12 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: Andre Naujoks, linux-kernel, Jiri Slaby, Greg Kroah-Hartman

On 07/02/2013 11:31 AM, Dean Jenkins wrote:
> This is an issue when SLIP is bound to a PTY/TTY. If
> TTY_DO_WRITE_WAKEUP is set, pty_write() calls tty_wakeup()
> then slip_write_wakeup() can be called. slip_write_wakeup() can
> be called by pty_write(). This is a recursion loop.
>
> pty_write() is called in sl_encaps().
> slip_write_wakeup() can be called by the TTY wakeup event.
>
> The pty_write() call in sl_encaps() will also call
> slip_write_wakeup() but xleft has not been updated and contains
> the value from the previous SLIP frame transmission. xleft is zero
> unless the previous SLIP frame failed to be fully transmitted in
> which case xleft has a positive value. A failed transmission causes
> the next SLIP frame pending transmission to cause a crash.
>
> In the failure case when xleft is positive in slip_write_wakeup(),
> recursion causes the stack to overflow and task structures located
> near the stack are corrupted by the stack overflow. The corrupted
> task structures crash the kernel's scheduler and the system
> crashes with exception handlers crashing and the emergency reboot
> fails.
>
> The recursion loop is:
> slip_write_wakeup()-->pty_write()-->tty_wakeup()-->slip_write_wakeup()
> etc.

pty_write() calling tty_wakeup() directly is a no-no.

This is fixed in tty-next.

Regards,
Peter Hurley

> Therefore ensure xleft is zero before writing the SLIP frame to the
> PTY/TTY layers. This prevents the xleft value of the previous SLIP
> frame from interfering with the slip_write_wakeup() execution when
> SLIP is bound to a PTY/TTY.
>
> Note the transmission segmentation mechanism is broken and only a
> single call to the write() function pointer will take place per
> SLIP frame. This could cause missing or truncated SLIP frames to
> be transmitted when the write() function fails to write the complete
> frame. In other words the TTY wakeup event does nothing because
> the TTY_DO_WRITE_WAKEUP flag has been cleared.
>
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>   drivers/net/slip/slip.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
> index bed819f..f7303e0 100644
> --- a/drivers/net/slip/slip.c
> +++ b/drivers/net/slip/slip.c
> @@ -395,6 +395,11 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
>   #endif
>   		count = slip_esc(p, sl->xbuff, len);
>
> +	/* ensure xleft set by the previous SLIP frame is zero for this frame
> +	 * otherwise slip_write_wakeup() can cause a recursive loop.
> +	 */
> +	sl->xleft = 0;
> +
>   	/* Order of next two lines is *very* important.
>   	 * When we are sending a little amount of data,
>   	 * the transfer may be completed inside the ops->write()
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/5] SLIP: Add error message for xleft non-zero
  2013-06-19 15:34 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
@ 2013-06-19 15:34 ` Dean Jenkins
  0 siblings, 0 replies; 14+ messages in thread
From: Dean Jenkins @ 2013-06-19 15:34 UTC (permalink / raw)
  To: davem, netdev

Add a printk to show when xleft is non-zero in sl_encaps.

The idea is to see whether a previous SLIP frame failed to be
fully transmitted.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 drivers/net/slip/slip.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index f7303e0..e2eff84 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -395,6 +395,10 @@ static void sl_encaps(struct slip *sl, unsigned char *icp, int len)
 #endif
 		count = slip_esc(p, sl->xbuff, len);
 
+	if (sl->xleft)
+		printk(KERN_ERR "%s: ERROR: xleft is non-zero %d\n",
+		       __func__, sl->xleft);
+
 	/* ensure xleft set by the previous SLIP frame is zero for this frame
 	 * otherwise slip_write_wakeup() can cause a recursive loop.
 	 */
-- 
1.8.1.5

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2013-07-25  1:12 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-02 15:31 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
2013-07-02 15:31 ` [PATCH 1/5] Bluetooth: Add RFCOMM TTY write return error codes Dean Jenkins
2013-07-24 23:12   ` Peter Hurley
2013-07-02 15:31 ` [PATCH 2/5] SLIP: Handle error codes from the TTY layer Dean Jenkins
2013-07-24 22:18   ` Greg Kroah-Hartman
2013-07-02 15:31 ` [PATCH 3/5] SLIP: Prevent recursion stack overflow and scheduler crash Dean Jenkins
2013-07-25  1:12   ` Peter Hurley
2013-07-02 15:31 ` [PATCH 4/5] SLIP: Add error message for xleft non-zero Dean Jenkins
2013-07-24 22:18   ` Greg Kroah-Hartman
2013-07-24 22:41     ` Paul Bolle
2013-07-02 15:31 ` [PATCH 5/5] SLIP: Fix transmission segmentation mechanism Dean Jenkins
2013-07-24 22:18   ` Greg Kroah-Hartman
2013-07-25  0:13   ` Peter Hurley
  -- strict thread matches above, loose matches on Subject: below --
2013-06-19 15:34 [PATCH 0/5] SLIP SLIP-Improve robustness to crashing Dean Jenkins
2013-06-19 15:34 ` [PATCH 4/5] SLIP: Add error message for xleft non-zero Dean Jenkins

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.