All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt
@ 2013-02-25 16:38 dean_jenkins
  2013-02-25 16:38 ` [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session dean_jenkins
                   ` (6 more replies)
  0 siblings, 7 replies; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

This patchset consists of the following 6 patches that reworks the RFCOMM
session refcnt by adding handling for a NULL RFCOMM session pointer when the
session has been deleted followed by removing the now redundant refcnt
mechanism:

----------------------------------------------------------------
Dean Jenkins (6):
      Bluetooth: Avoid rfcomm_session_timeout using freed session
      Bluetooth: Check rfcomm session and DLC exists on socket close
      Bluetooth: Return RFCOMM session ptrs to avoid freed session
      Bluetooth: Remove RFCOMM session refcnt
      Bluetooth: Remove redundant call to rfcomm_send_disc
      Bluetooth: Remove redundant RFCOMM BT_CLOSED settings

 include/net/bluetooth/rfcomm.h |    6 ----
 net/bluetooth/rfcomm/core.c    |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
 2 files changed, 76 insertions(+), 93 deletions(-)

The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
session refcnt so the patches have been forward ported. Crashes were due to
malfunctions of the RFCOMM session refcnt such as premature deletion of the
session and double deletion of the session. Some crashes were caused by
the interaction of 2 kernel threads under very high processor loading.

Note that the refcnt is incorrectly initialised, it should be set to 1 before
allowing the session structure to be used, the old code used a value of 0 and
this contributes to a weak implementation of the refcnt.

In kernels later than 2.6.34, the threading model for RFCOMM was modified to
use work queues and this reduced the probability of a RFCOMM crash. This
helps to explain why the reports of RFCOMM crashes went away. However, the
implementation of the RFCOMM session refcnt is weak and is in fact logically
incorrect by using the RFCOMM data channel "initiator" flag. The control
channel disconnection procedures do not care about how the control channel
was established eg. by host or remote device request.

The patches also make it clear when the RFCOMM session has been deleted by
using a NULL pointer. The old code has a possibility to access freed RFCOMM
session structures because there are multiple local copies of the pointer. The
patches ensure that local copies of the pointer are set to NULL when the
session has been deleted so avoiding any possibility of accessing freed memory.

Therefore, the patchset cleans up the disconnection of the RFCOMM session
control channel and now the code is deterministic and is understandable by
code review.

These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
of the patches were proven on a commercial project using a modified embedded ARM
SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
applied.

Patch summaries are shown below:

0001: Bluetooth: Avoid rfcomm_session_timeout using freed session
    
    Use del_timer_sync() instead of del_timer() as this ensures
    that rfcomm_session_timeout() is not running on a different
    CPU when rfcomm_session_put() is called. This avoids a race
    condition on SMP systems because potentially
    rfcomm_session_timeout() could reuse the freed RFCOMM session
    structure caused by the execution of rfcomm_session_put().
    
    Note that this modification makes the reason for the RFCOMM
    session refcnt mechanism redundant.
    

0002: Bluetooth: Check rfcomm session and DLC exists on socket close
    
    A race condition exists between near simultaneous asynchronous
    DLC data channel disconnection requests from the host and remote device.
    This causes the socket layer to request a socket shutdown at the same
    time the rfcomm core is processing the disconnect request from the remote
    device.
    
    The socket layer retains a copy of a struct rfcomm_dlc d pointer.
    The d pointer refers to a copy of a struct rfcomm_session.
    When the socket layer thread performs a socket shutdown, the thread
    may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
    whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
    pointed to by d maybe freed due to rfcomm core handling. Consequently,
    when the rfcomm lock becomes available and the thread runs, a
    malfunction could occur as a freed rfcomm_session structure and/or a
    freed rfcomm_dlc structure will be erroneously accessed.
    
    Therefore, after the rfcomm lock is acquired, check that the struct
    rfcomm_session is still valid by searching the rfcomm session list.
    If the session is valid then validate the d pointer by searching the
    rfcomm session list of active DLCs for the rfcomm_dlc structure
    pointed by d.
    

0003: Bluetooth: Return RFCOMM session ptrs to avoid freed session
    
    Unfortunately, the design retains local copies of the s RFCOMM
    session pointer in various code blocks and this invites the erroneous
    access to a freed RFCOMM session structure.
    
    Therefore, return the RFCOMM session pointer back up the call stack
    to avoid accessing a freed RFCOMM session structure. When the RFCOMM
    session is deleted, NULL is passed up the call stack.
    
    If active DLCs exist when the rfcomm session is terminating,
    avoid a memory leak of rfcomm_dlc structures by ensuring that
    rfcomm_session_close() is used instead of rfcomm_session_del().
    

0004: Bluetooth: Remove RFCOMM session refcnt
    
    Previous commits have improved the handling of the RFCOMM session
    timer and the RFCOMM session pointers such that freed RFCOMM
    session structures should no longer be erroneously accessed. The
    RFCOMM session refcnt now has no purpose and will be deleted by
    this commit.
    
    Note that the RFCOMM session is now deleted as soon as the
    RFCOMM control channel link is no longer required. This makes the
    lifetime of the RFCOMM session deterministic and absolute.
    Previously with the refcnt, there was uncertainty about when
    the session structure would be deleted because the relative
    refcnt prevented the session structure from being deleted at will.
    
    It was noted that the refcnt could malfunction under very heavy
    real-time processor loading in embedded SMP environments. This
    could cause premature RFCOMM session deletion or double session
    deletion that could result in kernel crashes. Removal of the
    refcnt prevents this issue.
    
    There are 4 connection / disconnection RFCOMM session scenarios:
    host initiated control link ---> host disconnected control link
    host initiated ctrl link ---> remote device disconnected ctrl link
    remote device initiated ctrl link ---> host disconnected ctrl link
    remote device initiated ctrl link ---> remote device disc'ed ctrl link
    
    The control channel connection procedures are independent of the
    disconnection procedures. Strangely, the RFCOMM session refcnt was
    applying special treatment so erroneously combining connection and
    disconnection events. This commit fixes this issue by removing
    some session code that used the "initiator" member of the session
    structure that was intended for use with the data channels.
    

0005: Bluetooth: Remove redundant call to rfcomm_send_disc
    
    In rfcomm_session_del() remove the redundant call to
    rfcomm_send_disc() because it is not possible for the
    session to be in BT_CONNECTED state during deletion
    of the session.
    

0006: Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
    
    rfcomm_session_close() sets the RFCOMM session state to BT_CLOSED.
    However, in multiple places immediately before the function is
    called, the RFCOMM session is set to BT_CLOSED. Therefore,
    remove these unnecessary state settings.
    

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-25 16:38 ` [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close dean_jenkins
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

Use del_timer_sync() instead of del_timer() as this ensures
that rfcomm_session_timeout() is not running on a different
CPU when rfcomm_session_put() is called. This avoids a race
condition on SMP systems because potentially
rfcomm_session_timeout() could reuse the freed RFCOMM session
structure caused by the execution of rfcomm_session_put().

Note that this modification makes the reason for the RFCOMM
session refcnt mechanism redundant.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 net/bluetooth/rfcomm/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 201fdf7..8780e67 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -257,7 +257,7 @@ static void rfcomm_session_clear_timer(struct rfcomm_session *s)
 {
 	BT_DBG("session %p state %ld", s, s->state);
 
-	if (timer_pending(&s->timer) && del_timer(&s->timer))
+	if (del_timer_sync(&s->timer))
 		rfcomm_session_put(s);
 }
 
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
  2013-02-25 16:38 ` [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-26 19:12   ` Gustavo Padovan
  2013-02-25 16:38 ` [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session dean_jenkins
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

A race condition exists between near simultaneous asynchronous
DLC data channel disconnection requests from the host and remote device.
This causes the socket layer to request a socket shutdown at the same
time the rfcomm core is processing the disconnect request from the remote
device.

The socket layer retains a copy of a struct rfcomm_dlc d pointer.
The d pointer refers to a copy of a struct rfcomm_session.
When the socket layer thread performs a socket shutdown, the thread
may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
pointed to by d maybe freed due to rfcomm core handling. Consequently,
when the rfcomm lock becomes available and the thread runs, a
malfunction could occur as a freed rfcomm_session structure and/or a
freed rfcomm_dlc structure will be erroneously accessed.

Therefore, after the rfcomm lock is acquired, check that the struct
rfcomm_session is still valid by searching the rfcomm session list.
If the session is valid then validate the d pointer by searching the
rfcomm session list of active DLCs for the rfcomm_dlc structure
pointed by d.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 net/bluetooth/rfcomm/core.c |   26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 8780e67..af0c26d 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -493,12 +493,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
 
 int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
 {
-	int r;
+	int r = 0;
+	struct rfcomm_dlc *d_list;
+	struct rfcomm_session *s, *s_list;
+
+	BT_DBG("dlc %p state %ld dlci %d err %d", d, d->state, d->dlci, err);
 
 	rfcomm_lock();
 
-	r = __rfcomm_dlc_close(d, err);
+	s = d->session;
+	if (!s)
+		goto no_session;
+
+	/* after waiting on the mutex check the session still exists */
+	list_for_each_entry(s_list, &session_list, list) {
+		if (s_list == s) {
+			/* after waiting on the mutex, */
+			/* check the dlc still exists */
+			list_for_each_entry(d_list, &s->dlcs, list) {
+				if (d_list == d) {
+					r = __rfcomm_dlc_close(d, err);
+					break;
+				}
+			}
+			break;
+		}
+	}
 
+no_session:
 	rfcomm_unlock();
 	return r;
 }
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
  2013-02-25 16:38 ` [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session dean_jenkins
  2013-02-25 16:38 ` [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-26 19:21   ` Gustavo Padovan
  2013-02-25 16:38 ` [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt dean_jenkins
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

Unfortunately, the design retains local copies of the s RFCOMM
session pointer in various code blocks and this invites the erroneous
access to a freed RFCOMM session structure.

Therefore, return the RFCOMM session pointer back up the call stack
to avoid accessing a freed RFCOMM session structure. When the RFCOMM
session is deleted, NULL is passed up the call stack.

If active DLCs exist when the rfcomm session is terminating,
avoid a memory leak of rfcomm_dlc structures by ensuring that
rfcomm_session_close() is used instead of rfcomm_session_del().

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 include/net/bluetooth/rfcomm.h |    3 +-
 net/bluetooth/rfcomm/core.c    |  106 +++++++++++++++++++++-------------------
 2 files changed, 58 insertions(+), 51 deletions(-)

diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index e2e3eca..a4e38ea 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -278,7 +278,8 @@ void   rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
 
 static inline void rfcomm_session_hold(struct rfcomm_session *s)
 {
-	atomic_inc(&s->refcnt);
+	if (s)
+		atomic_inc(&s->refcnt);
 }
 
 /* ---- RFCOMM sockets ---- */
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index af0c26d..60d2f1a 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -69,7 +69,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
 							u8 sec_level,
 							int *err);
 static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst);
-static void rfcomm_session_del(struct rfcomm_session *s);
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s);
 
 /* ---- RFCOMM frame parsing macros ---- */
 #define __get_dlci(b)     ((b & 0xfc) >> 2)
@@ -108,10 +108,12 @@ static void rfcomm_schedule(void)
 	wake_up_process(rfcomm_thread);
 }
 
-static void rfcomm_session_put(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
 {
-	if (atomic_dec_and_test(&s->refcnt))
-		rfcomm_session_del(s);
+	if (s && atomic_dec_and_test(&s->refcnt))
+		s = rfcomm_session_del(s);
+
+	return s;
 }
 
 /* ---- RFCOMM FCS computation ---- */
@@ -631,7 +633,7 @@ static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state)
 	return s;
 }
 
-static void rfcomm_session_del(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
 {
 	int state = s->state;
 
@@ -648,6 +650,8 @@ static void rfcomm_session_del(struct rfcomm_session *s)
 
 	if (state != BT_LISTEN)
 		module_put(THIS_MODULE);
+
+	return NULL;
 }
 
 static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
@@ -666,7 +670,8 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
 	return NULL;
 }
 
-static void rfcomm_session_close(struct rfcomm_session *s, int err)
+static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
+							int err)
 {
 	struct rfcomm_dlc *d;
 	struct list_head *p, *n;
@@ -685,7 +690,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
 	}
 
 	rfcomm_session_clear_timer(s);
-	rfcomm_session_put(s);
+	return rfcomm_session_put(s);
 }
 
 static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -737,8 +742,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
 	if (*err == 0 || *err == -EINPROGRESS)
 		return s;
 
-	rfcomm_session_del(s);
-	return NULL;
+	return rfcomm_session_del(s);
 
 failed:
 	sock_release(sock);
@@ -1127,7 +1131,7 @@ static void rfcomm_make_uih(struct sk_buff *skb, u8 addr)
 }
 
 /* ---- RFCOMM frame reception ---- */
-static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 {
 	BT_DBG("session %p state %ld dlci %d", s, s->state, dlci);
 
@@ -1136,7 +1140,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 		struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci);
 		if (!d) {
 			rfcomm_send_dm(s, dlci);
-			return 0;
+			return s;
 		}
 
 		switch (d->state) {
@@ -1172,25 +1176,14 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 			break;
 
 		case BT_DISCONN:
-			/* rfcomm_session_put is called later so don't do
-			 * anything here otherwise we will mess up the session
-			 * reference counter:
-			 *
-			 * (a) when we are the initiator dlc_unlink will drive
-			 * the reference counter to 0 (there is no initial put
-			 * after session_add)
-			 *
-			 * (b) when we are not the initiator rfcomm_rx_process
-			 * will explicitly call put to balance the initial hold
-			 * done after session add.
-			 */
+			s = rfcomm_session_close(s, ECONNRESET);
 			break;
 		}
 	}
-	return 0;
+	return s;
 }
 
-static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
 {
 	int err = 0;
 
@@ -1215,12 +1208,13 @@ static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
 			err = ECONNRESET;
 
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, err);
+		s = rfcomm_session_close(s, err);
 	}
-	return 0;
+	return s;
 }
 
-static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
+						u8 dlci)
 {
 	int err = 0;
 
@@ -1250,10 +1244,9 @@ static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
 			err = ECONNRESET;
 
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, err);
+		s = rfcomm_session_close(s, err);
 	}
-
-	return 0;
+	return s;
 }
 
 void rfcomm_dlc_accept(struct rfcomm_dlc *d)
@@ -1674,11 +1667,18 @@ drop:
 	return 0;
 }
 
-static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
+static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
+							struct sk_buff *skb)
 {
 	struct rfcomm_hdr *hdr = (void *) skb->data;
 	u8 type, dlci, fcs;
 
+	if (!s) {
+		/* no session, so free socket data */
+		kfree_skb(skb);
+		return s;
+	}
+
 	dlci = __get_dlci(hdr->addr);
 	type = __get_type(hdr->ctrl);
 
@@ -1689,7 +1689,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 	if (__check_fcs(skb->data, type, fcs)) {
 		BT_ERR("bad checksum in packet");
 		kfree_skb(skb);
-		return -EILSEQ;
+		return s;
 	}
 
 	if (__test_ea(hdr->len))
@@ -1705,22 +1705,23 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 
 	case RFCOMM_DISC:
 		if (__test_pf(hdr->ctrl))
-			rfcomm_recv_disc(s, dlci);
+			s = rfcomm_recv_disc(s, dlci);
 		break;
 
 	case RFCOMM_UA:
 		if (__test_pf(hdr->ctrl))
-			rfcomm_recv_ua(s, dlci);
+			s = rfcomm_recv_ua(s, dlci);
 		break;
 
 	case RFCOMM_DM:
-		rfcomm_recv_dm(s, dlci);
+		s = rfcomm_recv_dm(s, dlci);
 		break;
 
 	case RFCOMM_UIH:
-		if (dlci)
-			return rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
-
+		if (dlci) {
+			rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
+			return s;
+		}
 		rfcomm_recv_mcc(s, skb);
 		break;
 
@@ -1729,7 +1730,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 		break;
 	}
 	kfree_skb(skb);
-	return 0;
+	return s;
 }
 
 /* ---- Connection and data processing ---- */
@@ -1866,7 +1867,7 @@ static void rfcomm_process_dlcs(struct rfcomm_session *s)
 	}
 }
 
-static void rfcomm_process_rx(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_process_rx(struct rfcomm_session *s)
 {
 	struct socket *sock = s->sock;
 	struct sock *sk = sock->sk;
@@ -1878,17 +1879,20 @@ static void rfcomm_process_rx(struct rfcomm_session *s)
 	while ((skb = skb_dequeue(&sk->sk_receive_queue))) {
 		skb_orphan(skb);
 		if (!skb_linearize(skb))
-			rfcomm_recv_frame(s, skb);
+			s = rfcomm_recv_frame(s, skb);
 		else
 			kfree_skb(skb);
 	}
 
-	if (sk->sk_state == BT_CLOSED) {
+	if (s && (sk->sk_state == BT_CLOSED)) {
 		if (!s->initiator)
-			rfcomm_session_put(s);
+			s = rfcomm_session_put(s);
 
-		rfcomm_session_close(s, sk->sk_err);
+		if (s)
+			s = rfcomm_session_close(s, sk->sk_err);
 	}
+
+	return s;
 }
 
 static void rfcomm_accept_connection(struct rfcomm_session *s)
@@ -1925,7 +1929,7 @@ static void rfcomm_accept_connection(struct rfcomm_session *s)
 		sock_release(nsock);
 }
 
-static void rfcomm_check_connection(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_check_connection(struct rfcomm_session *s)
 {
 	struct sock *sk = s->sock->sk;
 
@@ -1944,9 +1948,10 @@ static void rfcomm_check_connection(struct rfcomm_session *s)
 
 	case BT_CLOSED:
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, sk->sk_err);
+		s = rfcomm_session_close(s, sk->sk_err);
 		break;
 	}
+	return s;
 }
 
 static void rfcomm_process_sessions(void)
@@ -1975,15 +1980,16 @@ static void rfcomm_process_sessions(void)
 
 		switch (s->state) {
 		case BT_BOUND:
-			rfcomm_check_connection(s);
+			s = rfcomm_check_connection(s);
 			break;
 
 		default:
-			rfcomm_process_rx(s);
+			s = rfcomm_process_rx(s);
 			break;
 		}
 
-		rfcomm_process_dlcs(s);
+		if (s)
+			rfcomm_process_dlcs(s);
 
 		rfcomm_session_put(s);
 	}
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
                   ` (2 preceding siblings ...)
  2013-02-25 16:38 ` [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-25 23:08   ` David Herrmann
  2013-02-25 16:38 ` [PATCH 5/6] Bluetooth: Remove redundant call to rfcomm_send_disc dean_jenkins
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

Previous commits have improved the handling of the RFCOMM session
timer and the RFCOMM session pointers such that freed RFCOMM
session structures should no longer be erroneously accessed. The
RFCOMM session refcnt now has no purpose and will be deleted by
this commit.

Note that the RFCOMM session is now deleted as soon as the
RFCOMM control channel link is no longer required. This makes the
lifetime of the RFCOMM session deterministic and absolute.
Previously with the refcnt, there was uncertainty about when
the session structure would be deleted because the relative
refcnt prevented the session structure from being deleted at will.

It was noted that the refcnt could malfunction under very heavy
real-time processor loading in embedded SMP environments. This
could cause premature RFCOMM session deletion or double session
deletion that could result in kernel crashes. Removal of the
refcnt prevents this issue.

There are 4 connection / disconnection RFCOMM session scenarios:
host initiated control link ---> host disconnected control link
host initiated ctrl link ---> remote device disconnected ctrl link
remote device initiated ctrl link ---> host disconnected ctrl link
remote device initiated ctrl link ---> remote device disc'ed ctrl link

The control channel connection procedures are independent of the
disconnection procedures. Strangely, the RFCOMM session refcnt was
applying special treatment so erroneously combining connection and
disconnection events. This commit fixes this issue by removing
some session code that used the "initiator" member of the session
structure that was intended for use with the data channels.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 include/net/bluetooth/rfcomm.h |    7 -------
 net/bluetooth/rfcomm/core.c    |   43 +++++-----------------------------------
 2 files changed, 5 insertions(+), 45 deletions(-)

diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index a4e38ea..7afd419 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -158,7 +158,6 @@ struct rfcomm_session {
 	struct timer_list timer;
 	unsigned long    state;
 	unsigned long    flags;
-	atomic_t         refcnt;
 	int              initiator;
 
 	/* Default DLC parameters */
@@ -276,12 +275,6 @@ static inline void rfcomm_dlc_unthrottle(struct rfcomm_dlc *d)
 void   rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
 								bdaddr_t *dst);
 
-static inline void rfcomm_session_hold(struct rfcomm_session *s)
-{
-	if (s)
-		atomic_inc(&s->refcnt);
-}
-
 /* ---- RFCOMM sockets ---- */
 struct sockaddr_rc {
 	sa_family_t	rc_family;
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 60d2f1a..ecef9bc 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -108,14 +108,6 @@ static void rfcomm_schedule(void)
 	wake_up_process(rfcomm_thread);
 }
 
-static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
-{
-	if (s && atomic_dec_and_test(&s->refcnt))
-		s = rfcomm_session_del(s);
-
-	return s;
-}
-
 /* ---- RFCOMM FCS computation ---- */
 
 /* reversed, 8-bit, poly=0x07 */
@@ -251,16 +243,14 @@ static void rfcomm_session_set_timer(struct rfcomm_session *s, long timeout)
 {
 	BT_DBG("session %p state %ld timeout %ld", s, s->state, timeout);
 
-	if (!mod_timer(&s->timer, jiffies + timeout))
-		rfcomm_session_hold(s);
+	mod_timer(&s->timer, jiffies + timeout);
 }
 
 static void rfcomm_session_clear_timer(struct rfcomm_session *s)
 {
 	BT_DBG("session %p state %ld", s, s->state);
 
-	if (del_timer_sync(&s->timer))
-		rfcomm_session_put(s);
+	del_timer_sync(&s->timer);
 }
 
 /* ---- RFCOMM DLCs ---- */
@@ -338,8 +328,6 @@ static void rfcomm_dlc_link(struct rfcomm_session *s, struct rfcomm_dlc *d)
 {
 	BT_DBG("dlc %p session %p", d, s);
 
-	rfcomm_session_hold(s);
-
 	rfcomm_session_clear_timer(s);
 	rfcomm_dlc_hold(d);
 	list_add(&d->list, &s->dlcs);
@@ -358,8 +346,6 @@ static void rfcomm_dlc_unlink(struct rfcomm_dlc *d)
 
 	if (list_empty(&s->dlcs))
 		rfcomm_session_set_timer(s, RFCOMM_IDLE_TIMEOUT);
-
-	rfcomm_session_put(s);
 }
 
 static struct rfcomm_dlc *rfcomm_dlc_get(struct rfcomm_session *s, u8 dlci)
@@ -678,8 +664,6 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
 
 	BT_DBG("session %p state %ld err %d", s, s->state, err);
 
-	rfcomm_session_hold(s);
-
 	s->state = BT_CLOSED;
 
 	/* Close all dlcs */
@@ -690,7 +674,7 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
 	}
 
 	rfcomm_session_clear_timer(s);
-	return rfcomm_session_put(s);
+	return rfcomm_session_del(s);
 }
 
 static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -1884,13 +1868,8 @@ static struct rfcomm_session *rfcomm_process_rx(struct rfcomm_session *s)
 			kfree_skb(skb);
 	}
 
-	if (s && (sk->sk_state == BT_CLOSED)) {
-		if (!s->initiator)
-			s = rfcomm_session_put(s);
-
-		if (s)
-			s = rfcomm_session_close(s, sk->sk_err);
-	}
+	if (s && (sk->sk_state == BT_CLOSED))
+		s = rfcomm_session_close(s, sk->sk_err);
 
 	return s;
 }
@@ -1917,8 +1896,6 @@ static void rfcomm_accept_connection(struct rfcomm_session *s)
 
 	s = rfcomm_session_add(nsock, BT_OPEN);
 	if (s) {
-		rfcomm_session_hold(s);
-
 		/* We should adjust MTU on incoming sessions.
 		 * L2CAP MTU minus UIH header and FCS. */
 		s->mtu = min(l2cap_pi(nsock->sk)->chan->omtu,
@@ -1967,7 +1944,6 @@ static void rfcomm_process_sessions(void)
 		if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
 			s->state = BT_DISCONN;
 			rfcomm_send_disc(s, 0);
-			rfcomm_session_put(s);
 			continue;
 		}
 
@@ -1976,8 +1952,6 @@ static void rfcomm_process_sessions(void)
 			continue;
 		}
 
-		rfcomm_session_hold(s);
-
 		switch (s->state) {
 		case BT_BOUND:
 			s = rfcomm_check_connection(s);
@@ -1990,8 +1964,6 @@ static void rfcomm_process_sessions(void)
 
 		if (s)
 			rfcomm_process_dlcs(s);
-
-		rfcomm_session_put(s);
 	}
 
 	rfcomm_unlock();
@@ -2041,7 +2013,6 @@ static int rfcomm_add_listener(bdaddr_t *ba)
 	if (!s)
 		goto failed;
 
-	rfcomm_session_hold(s);
 	return 0;
 failed:
 	sock_release(sock);
@@ -2099,8 +2070,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
 	if (!s)
 		return;
 
-	rfcomm_session_hold(s);
-
 	list_for_each_safe(p, n, &s->dlcs) {
 		d = list_entry(p, struct rfcomm_dlc, list);
 
@@ -2132,8 +2101,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
 			set_bit(RFCOMM_AUTH_REJECT, &d->flags);
 	}
 
-	rfcomm_session_put(s);
-
 	rfcomm_schedule();
 }
 
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/6] Bluetooth: Remove redundant call to rfcomm_send_disc
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
                   ` (3 preceding siblings ...)
  2013-02-25 16:38 ` [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-25 16:38 ` [PATCH 6/6] Bluetooth: Remove redundant RFCOMM BT_CLOSED settings dean_jenkins
  2013-02-25 18:32 ` [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt Marcel Holtmann
  6 siblings, 0 replies; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

In rfcomm_session_del() remove the redundant call to
rfcomm_send_disc() because it is not possible for the
session to be in BT_CONNECTED state during deletion
of the session.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 net/bluetooth/rfcomm/core.c |    3 ---
 1 file changed, 3 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index ecef9bc..3bdaf2e 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -627,9 +627,6 @@ static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
 
 	list_del(&s->list);
 
-	if (state == BT_CONNECTED)
-		rfcomm_send_disc(s, 0);
-
 	rfcomm_session_clear_timer(s);
 	sock_release(s->sock);
 	kfree(s);
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/6] Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
                   ` (4 preceding siblings ...)
  2013-02-25 16:38 ` [PATCH 5/6] Bluetooth: Remove redundant call to rfcomm_send_disc dean_jenkins
@ 2013-02-25 16:38 ` dean_jenkins
  2013-02-25 18:32 ` [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt Marcel Holtmann
  6 siblings, 0 replies; 12+ messages in thread
From: dean_jenkins @ 2013-02-25 16:38 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: marcel, gustavo

From: Dean Jenkins <Dean_Jenkins@mentor.com>

rfcomm_session_close() sets the RFCOMM session state to BT_CLOSED.
However, in multiple places immediately before the function is
called, the RFCOMM session is set to BT_CLOSED. Therefore,
remove these unnecessary state settings.

Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
---
 net/bluetooth/rfcomm/core.c |    7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 3bdaf2e..2e82b60 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -659,10 +659,10 @@ static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
 	struct rfcomm_dlc *d;
 	struct list_head *p, *n;
 
-	BT_DBG("session %p state %ld err %d", s, s->state, err);
-
 	s->state = BT_CLOSED;
 
+	BT_DBG("session %p state %ld err %d", s, s->state, err);
+
 	/* Close all dlcs */
 	list_for_each_safe(p, n, &s->dlcs) {
 		d = list_entry(p, struct rfcomm_dlc, list);
@@ -1188,7 +1188,6 @@ static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
 		else
 			err = ECONNRESET;
 
-		s->state = BT_CLOSED;
 		s = rfcomm_session_close(s, err);
 	}
 	return s;
@@ -1224,7 +1223,6 @@ static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
 		else
 			err = ECONNRESET;
 
-		s->state = BT_CLOSED;
 		s = rfcomm_session_close(s, err);
 	}
 	return s;
@@ -1921,7 +1919,6 @@ static struct rfcomm_session *rfcomm_check_connection(struct rfcomm_session *s)
 		break;
 
 	case BT_CLOSED:
-		s->state = BT_CLOSED;
 		s = rfcomm_session_close(s, sk->sk_err);
 		break;
 	}
-- 
1.7.10.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt
  2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
                   ` (5 preceding siblings ...)
  2013-02-25 16:38 ` [PATCH 6/6] Bluetooth: Remove redundant RFCOMM BT_CLOSED settings dean_jenkins
@ 2013-02-25 18:32 ` Marcel Holtmann
  2013-02-26 18:13   ` Dean Jenkins
  6 siblings, 1 reply; 12+ messages in thread
From: Marcel Holtmann @ 2013-02-25 18:32 UTC (permalink / raw)
  To: dean_jenkins; +Cc: linux-bluetooth, gustavo

Hi Dean,

> This patchset consists of the following 6 patches that reworks the RFCOMM
> session refcnt by adding handling for a NULL RFCOMM session pointer when the
> session has been deleted followed by removing the now redundant refcnt
> mechanism:
> 
> ----------------------------------------------------------------
> Dean Jenkins (6):
>       Bluetooth: Avoid rfcomm_session_timeout using freed session
>       Bluetooth: Check rfcomm session and DLC exists on socket close
>       Bluetooth: Return RFCOMM session ptrs to avoid freed session
>       Bluetooth: Remove RFCOMM session refcnt
>       Bluetooth: Remove redundant call to rfcomm_send_disc
>       Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
> 
>  include/net/bluetooth/rfcomm.h |    6 ----
>  net/bluetooth/rfcomm/core.c    |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
>  2 files changed, 76 insertions(+), 93 deletions(-)
> 
> The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
> SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
> session refcnt so the patches have been forward ported. Crashes were due to
> malfunctions of the RFCOMM session refcnt such as premature deletion of the
> session and double deletion of the session. Some crashes were caused by
> the interaction of 2 kernel threads under very high processor loading.
> 
> Note that the refcnt is incorrectly initialised, it should be set to 1 before
> allowing the session structure to be used, the old code used a value of 0 and
> this contributes to a weak implementation of the refcnt.
> 
> In kernels later than 2.6.34, the threading model for RFCOMM was modified to
> use work queues and this reduced the probability of a RFCOMM crash. This
> helps to explain why the reports of RFCOMM crashes went away. However, the
> implementation of the RFCOMM session refcnt is weak and is in fact logically
> incorrect by using the RFCOMM data channel "initiator" flag. The control
> channel disconnection procedures do not care about how the control channel
> was established eg. by host or remote device request.
> 
> The patches also make it clear when the RFCOMM session has been deleted by
> using a NULL pointer. The old code has a possibility to access freed RFCOMM
> session structures because there are multiple local copies of the pointer. The
> patches ensure that local copies of the pointer are set to NULL when the
> session has been deleted so avoiding any possibility of accessing freed memory.
> 
> Therefore, the patchset cleans up the disconnection of the RFCOMM session
> control channel and now the code is deterministic and is understandable by
> code review.
> 
> These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
> of the patches were proven on a commercial project using a modified embedded ARM
> SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
> disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
> applied.

I looked through the patch set and could not spot anything obvious
wrong. Thanks for looking into this. The RFCOMM reference counting has
been always giving me a headache.

Great job on the cover page and the commit messages. It was a pleasure
to read them and then go through the code. Makes review a lot easier.

On my account, lets get these patches in and see how they are doing.
Once they are in bluetooth-next they get a bit wider expose on upstream
kernels.

Acked-by: Marcel Holtmann <marcel@holtmann.org>

Regards

Marcel



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt
  2013-02-25 16:38 ` [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt dean_jenkins
@ 2013-02-25 23:08   ` David Herrmann
  0 siblings, 0 replies; 12+ messages in thread
From: David Herrmann @ 2013-02-25 23:08 UTC (permalink / raw)
  To: dean_jenkins; +Cc: linux-bluetooth, marcel, gustavo

Hi

On Mon, Feb 25, 2013 at 5:38 PM,  <dean_jenkins@mentor.com> wrote:
> From: Dean Jenkins <Dean_Jenkins@mentor.com>
>
> Previous commits have improved the handling of the RFCOMM session
> timer and the RFCOMM session pointers such that freed RFCOMM
> session structures should no longer be erroneously accessed. The
> RFCOMM session refcnt now has no purpose and will be deleted by
> this commit.
>
> Note that the RFCOMM session is now deleted as soon as the
> RFCOMM control channel link is no longer required. This makes the
> lifetime of the RFCOMM session deterministic and absolute.
> Previously with the refcnt, there was uncertainty about when
> the session structure would be deleted because the relative
> refcnt prevented the session structure from being deleted at will.
>
> It was noted that the refcnt could malfunction under very heavy
> real-time processor loading in embedded SMP environments. This
> could cause premature RFCOMM session deletion or double session
> deletion that could result in kernel crashes. Removal of the
> refcnt prevents this issue.
>
> There are 4 connection / disconnection RFCOMM session scenarios:
> host initiated control link ---> host disconnected control link
> host initiated ctrl link ---> remote device disconnected ctrl link
> remote device initiated ctrl link ---> host disconnected ctrl link
> remote device initiated ctrl link ---> remote device disc'ed ctrl link
>
> The control channel connection procedures are independent of the
> disconnection procedures. Strangely, the RFCOMM session refcnt was
> applying special treatment so erroneously combining connection and
> disconnection events. This commit fixes this issue by removing
> some session code that used the "initiator" member of the session
> structure that was intended for use with the data channels.

I tried to do a similar cleanup for the HIDP code (pending on the ML).
Did you check whether the "device_find_child()" and "device_move()"
calls in hci_sysfs.c hci_conn_del_sysfs() are still needed with this
cleanup? The HIDP cleanup doesn't need it, anymore, and if the rfcomm
code got rid of it, too, then we can finally drop that code.

I really think the *_hold() calls in the Bluetooth-core are flawed. We
should try to fix it to split ref-counting and life-time tracking.
Ref-counting for objects can be non-deterministic and this is totally
ok. But life-time tracking must be deterministic. That is, the
device_del() calls must be predictable so we can tell when objects get
deleted. Whether they get freed is minor, this doesn't harm anybody.

Thanks for the nice cleanup.
David

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt
  2013-02-25 18:32 ` [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt Marcel Holtmann
@ 2013-02-26 18:13   ` Dean Jenkins
  0 siblings, 0 replies; 12+ messages in thread
From: Dean Jenkins @ 2013-02-26 18:13 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: linux-bluetooth, gustavo

Hi Marcel,

On 25/02/13 18:32, Marcel Holtmann wrote:
> Hi Dean,
>
>> This patchset consists of the following 6 patches that reworks the RFCOMM
>> session refcnt by adding handling for a NULL RFCOMM session pointer when the
>> session has been deleted followed by removing the now redundant refcnt
>> mechanism:
>>
>> ----------------------------------------------------------------
>> Dean Jenkins (6):
>>        Bluetooth: Avoid rfcomm_session_timeout using freed session
>>        Bluetooth: Check rfcomm session and DLC exists on socket close
>>        Bluetooth: Return RFCOMM session ptrs to avoid freed session
>>        Bluetooth: Remove RFCOMM session refcnt
>>        Bluetooth: Remove redundant call to rfcomm_send_disc
>>        Bluetooth: Remove redundant RFCOMM BT_CLOSED settings
>>
>>   include/net/bluetooth/rfcomm.h |    6 ----
>>   net/bluetooth/rfcomm/core.c    |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------------
>>   2 files changed, 76 insertions(+), 93 deletions(-)
>>
>> The motivation for the change was RFCOMM kernel crashes in a 2.6.34 based ARM
>> SMP kernel. It is noted that RFCOMM in kernel 3.7 still contains the RFCOMM
>> session refcnt so the patches have been forward ported. Crashes were due to
>> malfunctions of the RFCOMM session refcnt such as premature deletion of the
>> session and double deletion of the session. Some crashes were caused by
>> the interaction of 2 kernel threads under very high processor loading.
>>
>> Note that the refcnt is incorrectly initialised, it should be set to 1 before
>> allowing the session structure to be used, the old code used a value of 0 and
>> this contributes to a weak implementation of the refcnt.
>>
>> In kernels later than 2.6.34, the threading model for RFCOMM was modified to
>> use work queues and this reduced the probability of a RFCOMM crash. This
>> helps to explain why the reports of RFCOMM crashes went away. However, the
>> implementation of the RFCOMM session refcnt is weak and is in fact logically
>> incorrect by using the RFCOMM data channel "initiator" flag. The control
>> channel disconnection procedures do not care about how the control channel
>> was established eg. by host or remote device request.
>>
>> The patches also make it clear when the RFCOMM session has been deleted by
>> using a NULL pointer. The old code has a possibility to access freed RFCOMM
>> session structures because there are multiple local copies of the pointer. The
>> patches ensure that local copies of the pointer are set to NULL when the
>> session has been deleted so avoiding any possibility of accessing freed memory.
>>
>> Therefore, the patchset cleans up the disconnection of the RFCOMM session
>> control channel and now the code is deterministic and is understandable by
>> code review.
>>
>> These patches were lightly tested against kernel 3.7.3 on a x86 PC. The design
>> of the patches were proven on a commercial project using a modified embedded ARM
>> SMP kernel 2.6.34.13. All previously observed RFCOMM control channel
>> disconnection kernel crashes on ARM 2.6.34.13 went away with the patches
>> applied.
> I looked through the patch set and could not spot anything obvious
> wrong. Thanks for looking into this. The RFCOMM reference counting has
> been always giving me a headache.
>
> Great job on the cover page and the commit messages. It was a pleasure
> to read them and then go through the code. Makes review a lot easier.
>
> On my account, lets get these patches in and see how they are doing.
> Once they are in bluetooth-next they get a bit wider expose on upstream
> kernels.
>
> Acked-by: Marcel Holtmann <marcel@holtmann.org>
>
> Regards
>
> Marcel
>
>
Many thanks for the encouraging feedback.

I wait with eager anticipation to see my patches in kernel.org and to 
see what feedback comes from upstream.

Thanks,

Regards,
Dean

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close
  2013-02-25 16:38 ` [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close dean_jenkins
@ 2013-02-26 19:12   ` Gustavo Padovan
  0 siblings, 0 replies; 12+ messages in thread
From: Gustavo Padovan @ 2013-02-26 19:12 UTC (permalink / raw)
  To: dean_jenkins; +Cc: linux-bluetooth, marcel

Hi Dean,

* dean_jenkins@mentor.com <dean_jenkins@mentor.com> [2013-02-25 16:38:33 +0000]:

> From: Dean Jenkins <Dean_Jenkins@mentor.com>
> 
> A race condition exists between near simultaneous asynchronous
> DLC data channel disconnection requests from the host and remote device.
> This causes the socket layer to request a socket shutdown at the same
> time the rfcomm core is processing the disconnect request from the remote
> device.
> 
> The socket layer retains a copy of a struct rfcomm_dlc d pointer.
> The d pointer refers to a copy of a struct rfcomm_session.
> When the socket layer thread performs a socket shutdown, the thread
> may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
> whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
> pointed to by d maybe freed due to rfcomm core handling. Consequently,
> when the rfcomm lock becomes available and the thread runs, a
> malfunction could occur as a freed rfcomm_session structure and/or a
> freed rfcomm_dlc structure will be erroneously accessed.
> 
> Therefore, after the rfcomm lock is acquired, check that the struct
> rfcomm_session is still valid by searching the rfcomm session list.
> If the session is valid then validate the d pointer by searching the
> rfcomm session list of active DLCs for the rfcomm_dlc structure
> pointed by d.
> 
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>  net/bluetooth/rfcomm/core.c |   26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index 8780e67..af0c26d 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -493,12 +493,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>  
>  int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>  {
> -	int r;
> +	int r = 0;
> +	struct rfcomm_dlc *d_list;
> +	struct rfcomm_session *s, *s_list;
> +
> +	BT_DBG("dlc %p state %ld dlci %d err %d", d, d->state, d->dlci, err);
>  
>  	rfcomm_lock();
>  
> -	r = __rfcomm_dlc_close(d, err);
> +	s = d->session;
> +	if (!s)
> +		goto no_session;
> +
> +	/* after waiting on the mutex check the session still exists */
> +	list_for_each_entry(s_list, &session_list, list) {
> +		if (s_list == s) {
> +			/* after waiting on the mutex, */
> +			/* check the dlc still exists */

Just a very small issue, this comment here is not following our coding style.
If you want a multiple line comment do something like this:

			/* after waiting on the mutex,
			 * check the dlc still exists
			 */

But I will just recommend you merge this comment with the one before the first
list_for_each_entry. please send an updated version of this patch so I can go
ahead and apply it.

	Gustavo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session
  2013-02-25 16:38 ` [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session dean_jenkins
@ 2013-02-26 19:21   ` Gustavo Padovan
  0 siblings, 0 replies; 12+ messages in thread
From: Gustavo Padovan @ 2013-02-26 19:21 UTC (permalink / raw)
  To: dean_jenkins; +Cc: linux-bluetooth, marcel

Hi Dean,

* dean_jenkins@mentor.com <dean_jenkins@mentor.com> [2013-02-25 16:38:34 +0000]:

> From: Dean Jenkins <Dean_Jenkins@mentor.com>
> 
> Unfortunately, the design retains local copies of the s RFCOMM
> session pointer in various code blocks and this invites the erroneous
> access to a freed RFCOMM session structure.
> 
> Therefore, return the RFCOMM session pointer back up the call stack
> to avoid accessing a freed RFCOMM session structure. When the RFCOMM
> session is deleted, NULL is passed up the call stack.
> 
> If active DLCs exist when the rfcomm session is terminating,
> avoid a memory leak of rfcomm_dlc structures by ensuring that
> rfcomm_session_close() is used instead of rfcomm_session_del().
> 
> Signed-off-by: Dean Jenkins <Dean_Jenkins@mentor.com>
> ---
>  include/net/bluetooth/rfcomm.h |    3 +-
>  net/bluetooth/rfcomm/core.c    |  106 +++++++++++++++++++++-------------------
>  2 files changed, 58 insertions(+), 51 deletions(-)
> 
> diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
> index e2e3eca..a4e38ea 100644
> --- a/include/net/bluetooth/rfcomm.h
> +++ b/include/net/bluetooth/rfcomm.h
> @@ -278,7 +278,8 @@ void   rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
>  
>  static inline void rfcomm_session_hold(struct rfcomm_session *s)
>  {
> -	atomic_inc(&s->refcnt);
> +	if (s)
> +		atomic_inc(&s->refcnt);
>  }
>  
>  /* ---- RFCOMM sockets ---- */
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index af0c26d..60d2f1a 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -69,7 +69,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
>  							u8 sec_level,
>  							int *err);
>  static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst);
> -static void rfcomm_session_del(struct rfcomm_session *s);
> +static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s);
>  
>  /* ---- RFCOMM frame parsing macros ---- */
>  #define __get_dlci(b)     ((b & 0xfc) >> 2)
> @@ -108,10 +108,12 @@ static void rfcomm_schedule(void)
>  	wake_up_process(rfcomm_thread);
>  }
>  
> -static void rfcomm_session_put(struct rfcomm_session *s)
> +static struct rfcomm_session *rfcomm_session_put(struct rfcomm_session *s)
>  {
> -	if (atomic_dec_and_test(&s->refcnt))
> -		rfcomm_session_del(s);
> +	if (s && atomic_dec_and_test(&s->refcnt))
> +		s = rfcomm_session_del(s);
> +
> +	return s;
>  }
>  
>  /* ---- RFCOMM FCS computation ---- */
> @@ -631,7 +633,7 @@ static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state)
>  	return s;
>  }
>  
> -static void rfcomm_session_del(struct rfcomm_session *s)
> +static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
>  {
>  	int state = s->state;
>  
> @@ -648,6 +650,8 @@ static void rfcomm_session_del(struct rfcomm_session *s)
>  
>  	if (state != BT_LISTEN)
>  		module_put(THIS_MODULE);
> +
> +	return NULL;
>  }
>  
>  static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
> @@ -666,7 +670,8 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
>  	return NULL;
>  }
>  
> -static void rfcomm_session_close(struct rfcomm_session *s, int err)
> +static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
> +							int err)

wrong identation here, you have to align with the opening parenthesis:

static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
						   int err)



>  {
>  	struct rfcomm_dlc *d;
>  	struct list_head *p, *n;
> @@ -685,7 +690,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
>  	}
>  
>  	rfcomm_session_clear_timer(s);
> -	rfcomm_session_put(s);
> +	return rfcomm_session_put(s);
>  }
>  
>  static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
> @@ -737,8 +742,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
>  	if (*err == 0 || *err == -EINPROGRESS)
>  		return s;
>  
> -	rfcomm_session_del(s);
> -	return NULL;
> +	return rfcomm_session_del(s);
>  
>  failed:
>  	sock_release(sock);
> @@ -1127,7 +1131,7 @@ static void rfcomm_make_uih(struct sk_buff *skb, u8 addr)
>  }
>  
>  /* ---- RFCOMM frame reception ---- */
> -static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
>  {
>  	BT_DBG("session %p state %ld dlci %d", s, s->state, dlci);
>  
> @@ -1136,7 +1140,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
>  		struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci);
>  		if (!d) {
>  			rfcomm_send_dm(s, dlci);
> -			return 0;
> +			return s;
>  		}
>  
>  		switch (d->state) {
> @@ -1172,25 +1176,14 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
>  			break;
>  
>  		case BT_DISCONN:
> -			/* rfcomm_session_put is called later so don't do
> -			 * anything here otherwise we will mess up the session
> -			 * reference counter:
> -			 *
> -			 * (a) when we are the initiator dlc_unlink will drive
> -			 * the reference counter to 0 (there is no initial put
> -			 * after session_add)
> -			 *
> -			 * (b) when we are not the initiator rfcomm_rx_process
> -			 * will explicitly call put to balance the initial hold
> -			 * done after session add.
> -			 */
> +			s = rfcomm_session_close(s, ECONNRESET);
>  			break;
>  		}
>  	}
> -	return 0;
> +	return s;
>  }
>  
> -static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
>  {
>  	int err = 0;
>  
> @@ -1215,12 +1208,13 @@ static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
>  			err = ECONNRESET;
>  
>  		s->state = BT_CLOSED;
> -		rfcomm_session_close(s, err);
> +		s = rfcomm_session_close(s, err);
>  	}
> -	return 0;
> +	return s;
>  }
>  
> -static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
> +static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
> +						u8 dlci)
>  {
>  	int err = 0;
>  
> @@ -1250,10 +1244,9 @@ static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
>  			err = ECONNRESET;
>  
>  		s->state = BT_CLOSED;
> -		rfcomm_session_close(s, err);
> +		s = rfcomm_session_close(s, err);
>  	}
> -
> -	return 0;
> +	return s;
>  }
>  
>  void rfcomm_dlc_accept(struct rfcomm_dlc *d)
> @@ -1674,11 +1667,18 @@ drop:
>  	return 0;
>  }
>  
> -static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
> +static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
> +							struct sk_buff *skb)

Same here.

Please fix this. You can collect Marcel's Ack and add them to your patches and
resend and updated version with these fixes in it.

	Gustavo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2013-02-26 19:21 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-25 16:38 [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt dean_jenkins
2013-02-25 16:38 ` [PATCH 1/6] Bluetooth: Avoid rfcomm_session_timeout using freed session dean_jenkins
2013-02-25 16:38 ` [PATCH 2/6] Bluetooth: Check rfcomm session and DLC exists on socket close dean_jenkins
2013-02-26 19:12   ` Gustavo Padovan
2013-02-25 16:38 ` [PATCH 3/6] Bluetooth: Return RFCOMM session ptrs to avoid freed session dean_jenkins
2013-02-26 19:21   ` Gustavo Padovan
2013-02-25 16:38 ` [PATCH 4/6] Bluetooth: Remove RFCOMM session refcnt dean_jenkins
2013-02-25 23:08   ` David Herrmann
2013-02-25 16:38 ` [PATCH 5/6] Bluetooth: Remove redundant call to rfcomm_send_disc dean_jenkins
2013-02-25 16:38 ` [PATCH 6/6] Bluetooth: Remove redundant RFCOMM BT_CLOSED settings dean_jenkins
2013-02-25 18:32 ` [PATCH 0/6] Bluetooth: Rework the RFCOMM session refcnt Marcel Holtmann
2013-02-26 18:13   ` Dean Jenkins

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.