All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt
@ 2012-08-11 18:47 Dean Jenkins
  2012-08-11 18:47 ` [PATCH 1/4] Bluetooth: Remove rfcomm " Dean Jenkins
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Dean Jenkins @ 2012-08-11 18:47 UTC (permalink / raw)
  To: linux-bluetooth

This patchset contains the following patches:

----------------------------------------------------------------
Dean Jenkins (4):
      Bluetooth: Remove rfcomm session refcnt
      Bluetooth: Return rfcomm session pointers to avoid freed session
      Bluetooth: Avoid rfcomm_session_timeout using freed pointer
      Bluetooth: On socket shutdown check rfcomm session and DLC exists

 include/net/bluetooth/rfcomm.h |    6 ----
 net/bluetooth/rfcomm/core.c    |  163 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------------
 2 files changed, 82 insertions(+), 87 deletions(-)

These rfcomm fixes includes a fix for a fundamental issue with the rfcomm
session refcnt that causes kernel crashes in high processor loaded
environments such as multi-core embedded ARM systems.

The concept of a refcnt is to increase the counter when something becomes
dependent on a resource. The counter is decreased when things no longer are
dependent on that resource. When the counter reaches zero, nothing is
dependent on the resource so the resource can be freed or released. The refcnt
should start with a value of 1 because something must be initially dependent on
the resource to prevent the resource from being immediately freed or released.
This means there must be 1 extra decrement operation when compared to the
number of increment operations (excluding the operation to set the initial
value of 1).

What does the rfcomm session refcnt protect ? This is unclear so here is
analysis of the locations of the calls to rfcomm_session_hold() that
increments the refcnt:

rfcomm_session_set_timer() - timer to force session disconnection when no DLCs
rfcomm_dlc_link() - binding DLC data channel to the session
rfcomm_session_close() - close the session but not delete session
rfcomm_accept_connection() - initialise refcnt to 1 for incoming connection
rfcomm_process_sessions() - protect processing of the session
rfcomm_add_listener() - initialise refcnt to 1 for listening session
rfcomm_security_cfm() - protect processing of the session


Lets go through the analysis of the rfcomm_session_hold() calls:

rfcomm_session_set_timer():
The refcnt is incremented when the rfcomm session timer is enabled. The timer
routine rfcomm_session_timeout() uses the session structure so the session
must not be deleted whilst the timer is enabled. This sounds like a valid
resource dependency, however, the timer is disabled each time the session is
closed or deleted. Therefore, protection of the session is unnecessary against
the timer expiring after the session has been deleted. It is noted that the
deletion of the timer is not SMP safe and a fix is supplied in this patchset.

rfcomm_dlc_link():
Each time a rfcomm DLC data connection is established, the refcnt is
incremented. This sounds like a valid resource dependency, however, the
rfcomm session is already dependent on DLC data channel connections because
there is a rfcomm_dlc list. In normal protocol operations, the rfcomm session is
deleted after the rfcomm control channel has been disconnected. The rfcomm
control channel is disconnected after all the rfcomm DLC data channels are
disconnected. Therefore, having a refcnt to protect against deletion of the
session whilst DLC data connections exist is unnecessary because a mechanism is
already in place.

rfcomm_session_close():
The call to rfcomm_session_hold() is matched by a call to rfcomm_session_put()
and protects the closing of DLC data connections. This looks like a
valid resource dependency to prevent the rfcomm session from being deleted
whilst closing the DLC data connections. However, rfcomm_session_close() is
called after the rfcomm control channel has disconnected or after the socket
layer has gone into a closed state (abnormal protocol handling). Therefore,
it seems that there is no risk of rfcomm_session_close() being called during
an active rfcomm control connection. This means protection of the rfcomm
session during the closing of any "orphaned" DLC data connection is
unnecessary.

rfcomm_accept_connection():
Each time an incoming rfcomm control channel connection request is detected, a
rfcomm session is created. The refcnt is set to 1 (via increment). This is
correct, however, it is not safe from thread context switching, see
below refcnt failure of rfcomm_security_cfm(). The refcnt should be set to 1
before the session is added to the session list to prevent premature deletion
of the rfcomm session due to other threads accessing the session list before
the initial increment has taken place. This is a race condition.

rfcomm_process_sessions():
The call to rfcomm_session_hold() is matched by a call to rfcomm_session_put()
and protects the calls to rfcomm_check_connection(), rfcomm_process_rx() and
rfcomm_process_dlcs(). This looks like a valid resource dependency. This is
hard to analyse. One observation is that the session cannot be deleted until
the rfcomm_session_put() is called, however, this assumes there is no
imbalance in the refcnt handling. Remember that a call to rfcomm_session_put()
will cause the session to be deleted when the refcnt hits zero. This makes the
code non-deterministic when considering abnormal protocol handling. In other
words, the session is deleted based on relative changes to the refcnt instead
of an absolute decisive reaction to an event to delete the session by the
session state machine.

rfcomm_add_listener():
A listening session is initially created. The refcnt is set to 1 (via
increment). This is correct, however, it is not safe from thread context
switching. The refcnt should be set to 1 before the session is added to the
session list to prevent premature deletion of the rfcomm session due to other
threads accessing the session list before the initial increment has taken place.
This is a race condition.

rfcomm_security_cfm():
The call to rfcomm_session_hold() is matched by a call to rfcomm_session_put()
and protects the operation of the list_for_each_safe(p, n, &s->dlcs) loop.
The protection is unnecessary because the loop does not modify the rfcomm
session structure. By experimentation, removal of the hold and put has no
impact and in fact prevents the scheduling while atomic warnings, see below.


Conclusions to what the refcnt is protecting
--------------------------------------------

Apart from doubts over the refcnt validity in rfcomm_process_sessions(), most of
the refcnt protection is duplicating existing protection mechanisms or is
setting the refcnt to an initial value of 1.

There is a fundamental flaw in the refcnt implementation. You will note that
the refcnt is not initially being set to 1 for host originated rfcomm control
channel connections. This means that there is an imbalance in the refcnt
treatment of host initiated and remote initiated rfcomm control channel
connections. It would appear that some workarounds in the disconnection code
were attempted (see below) that used the DLC initiator flag. These workarounds
are incorrect as the root cause is that the rfcomm session structure is not
added to the session list with a refcnt of 1. In other words, all rfcomm session
allocations should start with an initial refcnt of 1 and not 0. See above
description of how a refcnt should work.

IMHO the rfcomm session refcnt is adding extra code that is duplicating
existing protection mechanisms of the rfcomm core code. I believe the refcnt is
not adding anything useful to the solution and obscures the act of deleting the
rfcomm session structure. Is the refcnt worth fixing ? I think not, see
below for a solution as implemented by the patchset.

It could be argued that the refcnt and the rfcomm session state machine are
acting as a "belt and braces" solution. IMHO, the refcnt is therefore redundant
as shown by the solution in the patchset.

Note existing protection in rfcomm core includes the mutex rfcomm_lock() and
rfcomm_unlock() functions. This serialises the thread processing so only 1
rfcomm thread can run at any one time. The thread uses in a process context.
Note in kernel 2.6.34 this protection does not prevent rfcomm_security_cfm()
 unning (uses interrupt context) however this was changed in later kernels.


Patchset solution
-----------------

The "fix" for the rfcomm session refcnt is to remove the refcnt for the
following reasons.

1. The lifetime of a rfcomm session refcnt does not last outside the lifetime
of its associated rfcomm session structure. This means the refcnt becomes
invalid as soon as the session is deleted. Consequently, the rfcomm session
refcnt does not manage the rfcomm session pointer after the session has been
deleted. This allows local copies of the rfcomm session s pointer to be reused
after the rfcomm session structure has been freed. In particular, additional
accesses to rfcomm_session_put() and rfcomm_session_hold() after the session has
been deleted will be using freed or reallocated memory. This can lead to kernel
crashes or memory corruption. Test debug added to the kernel confirmed that a
freed rfcomm session was sometimes being reused in rfcomm_session_put() and
rfcomm_session_hold() function calls. This may result in a kernel crash (shown
below) or memory corruption.

ARM Project Kernel 2.6.34 unpatched kernel crash:

[  695.591056] Unable to handle kernel paging request at virtual address
00200200
[  695.791122] [<802886c4>] (rfcomm_session_del+0x18/0x70) from [<8028a560>]
(rfcomm_run+0xf10/0x12d4)
[  695.800260] [<8028a560>] (rfcomm_run+0xf10/0x12d4) from [<80065e70>]
(kthread+0x7c/0x84)
[  695.808454] [<80065e70>] (kthread+0x7c/0x84) from [<8002db04>]
(kernel_thread_exit+0x0/0x8)

2. The initial rfcomm session refcnt is 0 or 1 depending of which rfcomm peer
initated the rfcomm session. This is incorrect because the rfcomm session
connection and disconnection events are independent of each other. In
other words, the refcnt value must reach the same value to allow a successful
disconnection despite which peer initiated the rfcomm session. This is a
fundamental flaw in the refcnt implementation.

3. The rfcomm session refcnt inhibits the existing rfcomm session state machine
from deleting the rfcomm session structure immediately the state machine knows
that the session is finished. I suspect this was to prevent the session timer
expiring after the session was finished. However, it is noted that the session
timer is deleted when the session is closed so the refcnt is unnecessary for
protection. One of the patches fixes the session timer deletion for SMP systems.

4. "Broken workaround" code becomes obvious when the rfcomm session refcnt
is removed. It can be seen that multiple attempts to fix the refcnt have been
tried but these attempts failed to fix the the refcnt initialisation issue so
are flawed. Therefore remove the workaround code to allow a proper fix.

5. High processor loading causes the kernel thread order to change exposing
the rfcomm session refcnt to malfunction eg. premature session deletion or
double session deletion. In particular, rfcomm_security_cfm() causes refcnt
failure by premature session deletion when the refcnt was initially zero.
This issue is improved in kernels later than 2.6.34 as rfcomm_security_cfm()
runs in a process context rather than in an interrupt context. This caused
scheduling while atomic warnings and kernel crashes see below.

ARM Project Kernel 2.6.34 unpatched kernel crash due to premature rfcomm
session deletion in rfcomm_security_cfm() because refcnt was initially zero so
the first pair of rfcomm_session_hold() and rfcomm_session_put() calls in
rfcomm_security_cfm() cause the session to be deleted early:

[  165.033679] BUG: scheduling while atomic: uart/0/657/0x00000205
[  165.070813] [<80032754>] (unwind_backtrace+0x0/0xec) from [<802ab928>]
(schedule+0x80/0x610)
[  165.080075] [<802ab928>] (schedule+0x80/0x610) from [<802106a4>]
(lock_sock_nested+0x84/0xc4)
[  165.089350] [<802106a4>] (lock_sock_nested+0x84/0xc4) from [<80283f1c>]
(l2cap_sock_shutdown+0x24/0x1e0)
[  165.099554] [<80283f1c>] (l2cap_sock_shutdown+0x24/0x1e0) from [<802840f4>]
(l2cap_sock_release+0x1c/0x64)
[  165.110542] [<802840f4>] (l2cap_sock_release+0x1c/0x64) from [<8020efbc>]
(sock_release+0x20/0xb8)
[  165.120342] [<8020efbc>] (sock_release+0x20/0xb8) from [<80288844>]
(rfcomm_session_del+0x4c/0x70)
[  165.130347] [<80288844>] (rfcomm_session_del+0x4c/0x70) from [<80288cd8>]
(rfcomm_security_cfm+0x158/0x194)
[  165.141099] [<80288cd8>] (rfcomm_security_cfm+0x158/0x194) from [<80276c64>]
(hci_event_packet+0xc18/0x2f78)
[  165.151760] [<80276c64>] (hci_event_packet+0xc18/0x2f78) from [<802737d4>]
(hci_rx_task+0xa4/0x254)
[  165.161594] [<802737d4>] (hci_rx_task+0xa4/0x254) from [<80056568>]
(tasklet_action+0xa0/0x138)
[  165.171096] [<80056568>] (tasklet_action+0xa0/0x138) from [<80056984>]
(__do_softirq+0x94/0x124)
[  165.180657] [<80056984>] (__do_softirq+0x94/0x124) from [<80056ab0>]
(do_softirq+0x44/0x50)
[  165.189662] [<80056ab0>] (do_softirq+0x44/0x50) from [<80056cec>]
(local_bh_enable_ip+0x94/0xc8)
[  165.199323] [<80056cec>] (local_bh_enable_ip+0x94/0xc8) from [<801e34fc>]
(desc_get+0x84/0xf4)
[  165.208640] [<801e34fc>] (desc_get+0x84/0xf4) from [<801e41ec>]
(prep_slave_sg+0xc4/0x23c)
[  165.217627] [<801e41ec>] (prep_slave_sg+0xc4/0x23c) from [<801c2984>]
(tx_work_func+0x90/0x1c4)
[  165.227149] [<801c2984>] (tx_work_func+0x90/0x1c4) from [<80062ab4>]
(worker_thread+0xfc/0x194)
[  165.236621] [<80062ab4>] (worker_thread+0xfc/0x194) from [<80065d08>]
(kthread+0x7c/0x84)
[  165.245567] [<80065d08>] (kthread+0x7c/0x84) from [<8002db04>]
(kernel_thread_exit+0x0/0x8)


Additional information about the patches:

1: Bluetooth: Remove rfcomm session refcnt
------------------------------------------

This patch removes the rfcomm session refcnt. It also removes suspected
workarounds as follows:

@@ -1164,18 +1147,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 			break;
 
 		case BT_DISCONN:
-			/* rfcomm_session_put is called later so don't do
-			 * anything here otherwise we will mess up the session
-			 * reference counter:
-			 *
-			 * (a) when we are the initiator dlc_unlink will drive
-			 * the reference counter to 0 (there is no initial put
-			 * after session_add)
-			 *
-			 * (b) when we are not the initiator rfcomm_rx_process
-			 * will explicitly call put to balance the initial hold
-			 * done after session add.
-			 */
+			rfcomm_session_del(s);
 			break;

This workaround was incorrect because disconnection of the rfcomm session does
not depend on which peer initiated the session. The initiator flag is for the
DLC data connections and is not for the rfcomm session control channel.

The solution is to delete the rfcomm session after the DISC-UA exchange on
the control channel. This is safe because there are no resource dependencies on
the delete at this point in the rfcomm session state machine.


@@ -1875,12 +1847,8 @@ static inline void rfcomm_process_rx(struct rfcomm_session *s)
 			kfree_skb(skb);
 	}
 
-	if (sk->sk_state == BT_CLOSED) {
-		if (!s->initiator)
-			rfcomm_session_put(s);
-
+	if (sk->sk_state == BT_CLOSED)
 		rfcomm_session_close(s, sk->sk_err);
-	}
 }

This workaround was incorrect because the rfcomm session control channel is
not dependent on the DLC initiator flag. This workaround is probably attempting
to compensate for the fundamental refcnt imbalance on host initiated rfcomm
control channel connections.

The solution is to close the rfcomm session if the socket is detected as
BT_CLOSED. This occurs when L2CAP has disconnected but the rfcomm layer never
performed the disconnection procedure of the rfcomm control channel
eg. abnormal protocol handling.


2: Bluetooth: Return rfcomm session pointers to avoid freed session
-------------------------------------------------------------------

This is the major improvement that prevents the rfcomm session pointer from
being reused after the rfcomm session has been deleted. This is done by passing
the rfcomm session pointer back up the call stack so ensures that the various
local s pointers are updated in the various code blocks.

Analysis of the changes highlights a malfunction in the original code:

@@ -1842,13 +1861,15 @@ static inline void rfcomm_process_rx(struct rfcomm_session *s)
 	while ((skb = skb_dequeue(&sk->sk_receive_queue))) {
 		skb_orphan(skb);
 		if (!skb_linearize(skb))
-			rfcomm_recv_frame(s, skb);
+			s = rfcomm_recv_frame(s, skb);
 		else
 			kfree_skb(skb);
 	}
 
-	if (sk->sk_state == BT_CLOSED)
-		rfcomm_session_close(s, sk->sk_err);
+	if (s && (sk->sk_state == BT_CLOSED))
+		s = rfcomm_session_close(s, sk->sk_err);
+
+	return s;
 }

It can be seen that rfcomm_recv_frame() can result in the rfcomm session being
deleted, for example, due to a rfcomm DISC on the rfcomm control channel. This
exposes the rfcomm_session_close(s, sk->sk_err) to fail because the local s is
now pointing to freed memory or reallocated memory. The solution in the new code
is to check that the s pointer is not NULL as this NULL indicates the rfcomm
session has been deleted. The original code had no mechanism to determine
whether the rfcomm session structure had been deleted.

Therefore, a crash was possible if both a rfcomm DISC on the rfcomm control
channel is processed and the socket transitions to BT_CLOSED. This can occur
in high processor loading conditions whereby rfcomm_process_rx() becomes
pre-empted after the DISC handling and sleeps. The socket transitions to
BT_CLOSED during the sleep. When rfcomm_process_rx() awakens, it crashes because
the s pointer is now invalid when used in rfcomm_session_close().

3: Bluetooth: Avoid rfcomm_session_timeout using freed pointer
--------------------------------------------------------------

This patch ensures that the rfcomm session timer is properly deleted in SMP
systems by using timer_del_sync(). This avoids having special treatment such
as the use of a refcnt, assuming that was the intention of the refcnt in the
first place.

4: Bluetooth: On socket shutdown check rfcomm session and DLC exists
--------------------------------------------------------------------

This patch prevents old pointers from the socket layer causing rfcomm crashes
during socket shutdown triggered from userland. The old pointers attempt
to disconnect no longer existing DLC connections.

ARM Project Kernel 2.6.34 unpatched kernel crash

[  269.534202] Unable to handle kernel NULL pointer dereference at virtual
address 00000028
[  269.784897] [<8020e9c8>] (sock_sendmsg+0x78/0xac) from [<8020ea3c>]
(kernel_sendmsg+0x40/0x7c)
[  269.793614] [<8020ea3c>] (kernel_sendmsg+0x40/0x7c) from [<80289054>]
(rfcomm_send_frame+0x40/0x48)
[  269.802743] [<80289054>] (rfcomm_send_frame+0x40/0x48) from [<802890c4>]
(rfcomm_send_disc+0x68/0x70)
[  269.812045] [<802890c4>] (rfcomm_send_disc+0x68/0x70) from [<80289760>]
(__rfcomm_dlc_close+0x84/0x288)
[  269.821519] [<80289760>] (__rfcomm_dlc_close+0x84/0x288) from [<80289988>]
(rfcomm_dlc_close+0x24/0x3c)
[  269.830995] [<80289988>] (rfcomm_dlc_close+0x24/0x3c) from [<8028b9a4>]
(__rfcomm_sock_close+0x84/0x94)
[  269.840489] [<8028b9a4>] (__rfcomm_sock_close+0x84/0x94) from [<8028b9f0>]
(rfcomm_sock_shutdown+0x3c/0x7c)
[  269.850326] [<8028b9f0>] (rfcomm_sock_shutdown+0x3c/0x7c) from [<8020f008>]
(sys_shutdown+0x34/0x54)
[  269.859554] [<8020f008>] (sys_shutdown+0x34/0x54) from [<8002d0a0>]
(ret_fast_syscall+0x0/0x30)



Testing
-------

Testing including field-trialling and IOP testing has been performed on a
patchset in a multi-core ARM using a 2.6.34 based kernel.

These tested patches have been forward ported to Linux 3.5-RC1 and are released
as a patchset. Basic tests on x86 have been done but it is difficult to
reproduce failure scenarios on a high performance x86 because timing is critical
to cause a malfunction of the unpatched code.

It should be noted that the rfcomm code has changed little between 2.6.34 and
3.5-RC1 with the exception of using process context instead of interrupt
context for all rfcomm core threads. This change may lower the risk of a rfcomm
session refcnt failure, in particular avoiding the rfcomm_security_cfm()
failure of scheduling while atomic.


I welcome discussion on these patches. I am keen to get these changes into
kernel.org. I am willing to re-factor the changes based on feedback from
the community.

I am on IRC in #bluez @ freenode with nick DeanJenkins if you wish to chat
to me about these patches.

Regards,
Dean Jenkins

MontaVista Software, LLC

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] Bluetooth: Remove rfcomm session refcnt
  2012-08-11 18:47 [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt Dean Jenkins
@ 2012-08-11 18:47 ` Dean Jenkins
  2012-08-11 18:47 ` [PATCH 2/4] Bluetooth: Return rfcomm session pointers to avoid freed session Dean Jenkins
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Dean Jenkins @ 2012-08-11 18:47 UTC (permalink / raw)
  To: linux-bluetooth

The rfcomm session refcnt is unworkable as it is hard
to predict the program flow during abnormal protocol conditions
so is easy for the refcnt to be out-of-sync. In addition, high
processor loading can cause the run-time thread order to change
resulting in malfunctioning of the session refcnt eg. premature
session deletion or double session deletion resulting in
kernel crashes.

Therefore, rely on the rfcomm session state machine to absolutely
delete the rfcomm session at the right time. The rfcomm session
state machine is controlled by the DLC data channel connection
states. The rfcomm session is created when a DLC data channel is
required. The rfcomm session is deleted when all DLC data channels
are closed or in abnormal conditions when the socket is BT_CLOSED.

There are 4 connection / disconnection rfcomm session scenarios:
host initiated: host disconnected
host initiated: remote disconnected
remote initiated: host disconnected
remote initiated: remote disconnected

The connection procedures are independent of the disconnection
procedures. Strangely, the session refcnt was applying special
treatment so erroneously combining connection and disconnection
events.

Signed-off-by: Dean Jenkins <djenkins@mvista.com>
---
 include/net/bluetooth/rfcomm.h |    6 -----
 net/bluetooth/rfcomm/core.c    |   56 +++++-----------------------------------
 2 files changed, 6 insertions(+), 56 deletions(-)

diff --git a/include/net/bluetooth/rfcomm.h b/include/net/bluetooth/rfcomm.h
index e2e3eca..7afd419 100644
--- a/include/net/bluetooth/rfcomm.h
+++ b/include/net/bluetooth/rfcomm.h
@@ -158,7 +158,6 @@ struct rfcomm_session {
 	struct timer_list timer;
 	unsigned long    state;
 	unsigned long    flags;
-	atomic_t         refcnt;
 	int              initiator;
 
 	/* Default DLC parameters */
@@ -276,11 +275,6 @@ static inline void rfcomm_dlc_unthrottle(struct rfcomm_dlc *d)
 void   rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src,
 								bdaddr_t *dst);
 
-static inline void rfcomm_session_hold(struct rfcomm_session *s)
-{
-	atomic_inc(&s->refcnt);
-}
-
 /* ---- RFCOMM sockets ---- */
 struct sockaddr_rc {
 	sa_family_t	rc_family;
diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 8a60238..b0805c1 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -122,12 +122,6 @@ static inline void rfcomm_schedule(void)
 	wake_up_process(rfcomm_thread);
 }
 
-static inline void rfcomm_session_put(struct rfcomm_session *s)
-{
-	if (atomic_dec_and_test(&s->refcnt))
-		rfcomm_session_del(s);
-}
-
 /* ---- RFCOMM FCS computation ---- */
 
 /* reversed, 8-bit, poly=0x07 */
@@ -263,16 +257,15 @@ static void rfcomm_session_set_timer(struct rfcomm_session *s, long timeout)
 {
 	BT_DBG("session %p state %ld timeout %ld", s, s->state, timeout);
 
-	if (!mod_timer(&s->timer, jiffies + timeout))
-		rfcomm_session_hold(s);
+	mod_timer(&s->timer, jiffies + timeout);
 }
 
 static void rfcomm_session_clear_timer(struct rfcomm_session *s)
 {
 	BT_DBG("session %p state %ld", s, s->state);
 
-	if (timer_pending(&s->timer) && del_timer(&s->timer))
-		rfcomm_session_put(s);
+	if (timer_pending(&s->timer))
+		del_timer(&s->timer);
 }
 
 /* ---- RFCOMM DLCs ---- */
@@ -350,8 +343,6 @@ static void rfcomm_dlc_link(struct rfcomm_session *s, struct rfcomm_dlc *d)
 {
 	BT_DBG("dlc %p session %p", d, s);
 
-	rfcomm_session_hold(s);
-
 	rfcomm_session_clear_timer(s);
 	rfcomm_dlc_hold(d);
 	list_add(&d->list, &s->dlcs);
@@ -370,8 +361,6 @@ static void rfcomm_dlc_unlink(struct rfcomm_dlc *d)
 
 	if (list_empty(&s->dlcs))
 		rfcomm_session_set_timer(s, RFCOMM_IDLE_TIMEOUT);
-
-	rfcomm_session_put(s);
 }
 
 static struct rfcomm_dlc *rfcomm_dlc_get(struct rfcomm_session *s, u8 dlci)
@@ -631,9 +620,6 @@ static void rfcomm_session_del(struct rfcomm_session *s)
 
 	list_del(&s->list);
 
-	if (state == BT_CONNECTED)
-		rfcomm_send_disc(s, 0);
-
 	rfcomm_session_clear_timer(s);
 	sock_release(s->sock);
 	kfree(s);
@@ -665,8 +651,6 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
 
 	BT_DBG("session %p state %ld err %d", s, s->state, err);
 
-	rfcomm_session_hold(s);
-
 	s->state = BT_CLOSED;
 
 	/* Close all dlcs */
@@ -676,8 +660,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
 		__rfcomm_dlc_close(d, err);
 	}
 
-	rfcomm_session_clear_timer(s);
-	rfcomm_session_put(s);
+	rfcomm_session_del(s);
 }
 
 static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -1164,18 +1147,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 			break;
 
 		case BT_DISCONN:
-			/* rfcomm_session_put is called later so don't do
-			 * anything here otherwise we will mess up the session
-			 * reference counter:
-			 *
-			 * (a) when we are the initiator dlc_unlink will drive
-			 * the reference counter to 0 (there is no initial put
-			 * after session_add)
-			 *
-			 * (b) when we are not the initiator rfcomm_rx_process
-			 * will explicitly call put to balance the initial hold
-			 * done after session add.
-			 */
+			rfcomm_session_del(s);
 			break;
 		}
 	}
@@ -1875,12 +1847,8 @@ static inline void rfcomm_process_rx(struct rfcomm_session *s)
 			kfree_skb(skb);
 	}
 
-	if (sk->sk_state == BT_CLOSED) {
-		if (!s->initiator)
-			rfcomm_session_put(s);
-
+	if (sk->sk_state == BT_CLOSED)
 		rfcomm_session_close(s, sk->sk_err);
-	}
 }
 
 static inline void rfcomm_accept_connection(struct rfcomm_session *s)
@@ -1905,8 +1873,6 @@ static inline void rfcomm_accept_connection(struct rfcomm_session *s)
 
 	s = rfcomm_session_add(nsock, BT_OPEN);
 	if (s) {
-		rfcomm_session_hold(s);
-
 		/* We should adjust MTU on incoming sessions.
 		 * L2CAP MTU minus UIH header and FCS. */
 		s->mtu = min(l2cap_pi(nsock->sk)->chan->omtu,
@@ -1954,7 +1920,6 @@ static inline void rfcomm_process_sessions(void)
 		if (test_and_clear_bit(RFCOMM_TIMED_OUT, &s->flags)) {
 			s->state = BT_DISCONN;
 			rfcomm_send_disc(s, 0);
-			rfcomm_session_put(s);
 			continue;
 		}
 
@@ -1963,8 +1928,6 @@ static inline void rfcomm_process_sessions(void)
 			continue;
 		}
 
-		rfcomm_session_hold(s);
-
 		switch (s->state) {
 		case BT_BOUND:
 			rfcomm_check_connection(s);
@@ -1976,8 +1939,6 @@ static inline void rfcomm_process_sessions(void)
 		}
 
 		rfcomm_process_dlcs(s);
-
-		rfcomm_session_put(s);
 	}
 
 	rfcomm_unlock();
@@ -2027,7 +1988,6 @@ static int rfcomm_add_listener(bdaddr_t *ba)
 	if (!s)
 		goto failed;
 
-	rfcomm_session_hold(s);
 	return 0;
 failed:
 	sock_release(sock);
@@ -2085,8 +2045,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
 	if (!s)
 		return;
 
-	rfcomm_session_hold(s);
-
 	list_for_each_safe(p, n, &s->dlcs) {
 		d = list_entry(p, struct rfcomm_dlc, list);
 
@@ -2118,8 +2076,6 @@ static void rfcomm_security_cfm(struct hci_conn *conn, u8 status, u8 encrypt)
 			set_bit(RFCOMM_AUTH_REJECT, &d->flags);
 	}
 
-	rfcomm_session_put(s);
-
 	rfcomm_schedule();
 }
 
-- 
1.7.10.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] Bluetooth: Return rfcomm session pointers to avoid freed session
  2012-08-11 18:47 [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt Dean Jenkins
  2012-08-11 18:47 ` [PATCH 1/4] Bluetooth: Remove rfcomm " Dean Jenkins
@ 2012-08-11 18:47 ` Dean Jenkins
  2012-08-11 18:47 ` [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer Dean Jenkins
  2012-08-11 18:47 ` [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists Dean Jenkins
  3 siblings, 0 replies; 9+ messages in thread
From: Dean Jenkins @ 2012-08-11 18:47 UTC (permalink / raw)
  To: linux-bluetooth

Unfortunately, the design retains copies of the s rfcomm session
pointer in various code blocks and this invites the reuse of a
freed session pointer.

Therefore, return the rfcomm session pointer back up the call stack
to avoid reusing a freed rfcomm session pointer. When the rfcomm
session is deleted, NULL is passed up the call stack.

Signed-off-by: Dean Jenkins <djenkins@mvista.com>
---
 net/bluetooth/rfcomm/core.c |   88 +++++++++++++++++++++++++++----------------
 1 file changed, 55 insertions(+), 33 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index b0805c1..24d4d3c 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -83,7 +83,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
 							u8 sec_level,
 							int *err);
 static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst);
-static void rfcomm_session_del(struct rfcomm_session *s);
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s);
 
 /* ---- RFCOMM frame parsing macros ---- */
 #define __get_dlci(b)     ((b & 0xfc) >> 2)
@@ -612,9 +612,13 @@ static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state)
 	return s;
 }
 
-static void rfcomm_session_del(struct rfcomm_session *s)
+static struct rfcomm_session *rfcomm_session_del(struct rfcomm_session *s)
 {
-	int state = s->state;
+	int state;
+
+	BUG_ON(s == NULL);
+
+	state = s->state;
 
 	BT_DBG("session %p state %ld", s, s->state);
 
@@ -626,6 +630,8 @@ static void rfcomm_session_del(struct rfcomm_session *s)
 
 	if (state != BT_LISTEN)
 		module_put(THIS_MODULE);
+
+	return NULL;
 }
 
 static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
@@ -644,11 +650,14 @@ static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst)
 	return NULL;
 }
 
-static void rfcomm_session_close(struct rfcomm_session *s, int err)
+static struct rfcomm_session *rfcomm_session_close(struct rfcomm_session *s,
+							int err)
 {
 	struct rfcomm_dlc *d;
 	struct list_head *p, *n;
 
+	BUG_ON(s == NULL);
+
 	BT_DBG("session %p state %ld err %d", s, s->state, err);
 
 	s->state = BT_CLOSED;
@@ -660,7 +669,7 @@ static void rfcomm_session_close(struct rfcomm_session *s, int err)
 		__rfcomm_dlc_close(d, err);
 	}
 
-	rfcomm_session_del(s);
+	return rfcomm_session_del(s);
 }
 
 static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
@@ -712,8 +721,7 @@ static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src,
 	if (*err == 0 || *err == -EINPROGRESS)
 		return s;
 
-	rfcomm_session_del(s);
-	return NULL;
+	return rfcomm_session_del(s);
 
 failed:
 	sock_release(sock);
@@ -1102,7 +1110,7 @@ static void rfcomm_make_uih(struct sk_buff *skb, u8 addr)
 }
 
 /* ---- RFCOMM frame reception ---- */
-static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 {
 	BT_DBG("session %p state %ld dlci %d", s, s->state, dlci);
 
@@ -1111,7 +1119,7 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 		struct rfcomm_dlc *d = rfcomm_dlc_get(s, dlci);
 		if (!d) {
 			rfcomm_send_dm(s, dlci);
-			return 0;
+			return s;
 		}
 
 		switch (d->state) {
@@ -1147,14 +1155,14 @@ static int rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 			break;
 
 		case BT_DISCONN:
-			rfcomm_session_del(s);
+			s = rfcomm_session_del(s);
 			break;
 		}
 	}
-	return 0;
+	return s;
 }
 
-static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
 {
 	int err = 0;
 
@@ -1179,12 +1187,13 @@ static int rfcomm_recv_dm(struct rfcomm_session *s, u8 dlci)
 			err = ECONNRESET;
 
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, err);
+		s = rfcomm_session_close(s, err);
 	}
-	return 0;
+	return s;
 }
 
-static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
+static struct rfcomm_session *rfcomm_recv_disc(struct rfcomm_session *s,
+						u8 dlci)
 {
 	int err = 0;
 
@@ -1214,10 +1223,10 @@ static int rfcomm_recv_disc(struct rfcomm_session *s, u8 dlci)
 			err = ECONNRESET;
 
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, err);
+		s = rfcomm_session_close(s, err);
 	}
 
-	return 0;
+	return s;
 }
 
 void rfcomm_dlc_accept(struct rfcomm_dlc *d)
@@ -1638,11 +1647,18 @@ drop:
 	return 0;
 }
 
-static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
+static struct rfcomm_session *rfcomm_recv_frame(struct rfcomm_session *s,
+						struct sk_buff *skb)
 {
 	struct rfcomm_hdr *hdr = (void *) skb->data;
 	u8 type, dlci, fcs;
 
+	if (!s) {
+		/* no session, so free socket data */
+		kfree_skb(skb);
+		return s;
+	}
+
 	dlci = __get_dlci(hdr->addr);
 	type = __get_type(hdr->ctrl);
 
@@ -1653,7 +1669,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 	if (__check_fcs(skb->data, type, fcs)) {
 		BT_ERR("bad checksum in packet");
 		kfree_skb(skb);
-		return -EILSEQ;
+		return s;
 	}
 
 	if (__test_ea(hdr->len))
@@ -1669,22 +1685,23 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 
 	case RFCOMM_DISC:
 		if (__test_pf(hdr->ctrl))
-			rfcomm_recv_disc(s, dlci);
+			s = rfcomm_recv_disc(s, dlci);
 		break;
 
 	case RFCOMM_UA:
 		if (__test_pf(hdr->ctrl))
-			rfcomm_recv_ua(s, dlci);
+			s = rfcomm_recv_ua(s, dlci);
 		break;
 
 	case RFCOMM_DM:
-		rfcomm_recv_dm(s, dlci);
+		s = rfcomm_recv_dm(s, dlci);
 		break;
 
 	case RFCOMM_UIH:
-		if (dlci)
-			return rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
-
+		if (dlci) {
+			rfcomm_recv_data(s, dlci, __test_pf(hdr->ctrl), skb);
+			return s;
+		}
 		rfcomm_recv_mcc(s, skb);
 		break;
 
@@ -1693,7 +1710,7 @@ static int rfcomm_recv_frame(struct rfcomm_session *s, struct sk_buff *skb)
 		break;
 	}
 	kfree_skb(skb);
-	return 0;
+	return s;
 }
 
 /* ---- Connection and data processing ---- */
@@ -1775,6 +1792,8 @@ static inline void rfcomm_process_dlcs(struct rfcomm_session *s)
 	struct rfcomm_dlc *d;
 	struct list_head *p, *n;
 
+	BUG_ON(s == NULL);
+
 	BT_DBG("session %p state %ld", s, s->state);
 
 	list_for_each_safe(p, n, &s->dlcs) {
@@ -1830,7 +1849,7 @@ static inline void rfcomm_process_dlcs(struct rfcomm_session *s)
 	}
 }
 
-static inline void rfcomm_process_rx(struct rfcomm_session *s)
+static inline struct rfcomm_session *rfcomm_process_rx(struct rfcomm_session *s)
 {
 	struct socket *sock = s->sock;
 	struct sock *sk = sock->sk;
@@ -1842,13 +1861,15 @@ static inline void rfcomm_process_rx(struct rfcomm_session *s)
 	while ((skb = skb_dequeue(&sk->sk_receive_queue))) {
 		skb_orphan(skb);
 		if (!skb_linearize(skb))
-			rfcomm_recv_frame(s, skb);
+			s = rfcomm_recv_frame(s, skb);
 		else
 			kfree_skb(skb);
 	}
 
-	if (sk->sk_state == BT_CLOSED)
-		rfcomm_session_close(s, sk->sk_err);
+	if (s && (sk->sk_state == BT_CLOSED))
+		s = rfcomm_session_close(s, sk->sk_err);
+
+	return s;
 }
 
 static inline void rfcomm_accept_connection(struct rfcomm_session *s)
@@ -1902,7 +1923,7 @@ static inline void rfcomm_check_connection(struct rfcomm_session *s)
 
 	case BT_CLOSED:
 		s->state = BT_CLOSED;
-		rfcomm_session_close(s, sk->sk_err);
+		s = rfcomm_session_close(s, sk->sk_err);
 		break;
 	}
 }
@@ -1934,11 +1955,12 @@ static inline void rfcomm_process_sessions(void)
 			break;
 
 		default:
-			rfcomm_process_rx(s);
+			s = rfcomm_process_rx(s);
 			break;
 		}
 
-		rfcomm_process_dlcs(s);
+		if (s)
+			rfcomm_process_dlcs(s);
 	}
 
 	rfcomm_unlock();
-- 
1.7.10.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer
  2012-08-11 18:47 [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt Dean Jenkins
  2012-08-11 18:47 ` [PATCH 1/4] Bluetooth: Remove rfcomm " Dean Jenkins
  2012-08-11 18:47 ` [PATCH 2/4] Bluetooth: Return rfcomm session pointers to avoid freed session Dean Jenkins
@ 2012-08-11 18:47 ` Dean Jenkins
  2012-08-21 18:56   ` Gustavo Padovan
  2012-08-11 18:47 ` [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists Dean Jenkins
  3 siblings, 1 reply; 9+ messages in thread
From: Dean Jenkins @ 2012-08-11 18:47 UTC (permalink / raw)
  To: linux-bluetooth

rfcomm_session_timeout() protects the scenario of the remote
Bluetooth device failing to send a DISC on the rfcomm control
channel after the last data DLC channel has been closed.

There is a race condition between the timer expiring causing
rfcomm_session_timeout() to run and the rfcomm session being
deleted. If the rfcomm session is deleted then
rfcomm_session_timeout() would use a freed rfcomm session
pointer resulting in a potential kernel crash or memory corruption.
Note the timer is cleared before the rfcomm session is deleted
by del_timer() so the circumstances for a failure to occur are
as follows:

rfcomm_session_timeout() needs to be executing before the
del_timer() is called to clear the timer but
rfcomm_session_timeout() needs to be delayed from using the
rfcomm session pointer until after the session has been deleted.
Therefore, there is a very small window of opportunity for failure.

The solution is to use del_timer_sync() instead of del_timer()
as this ensures that rfcomm_session_timeout() is not running
when del_timer_sync() returns. This means that it is not
possible for rfcomm_session_timeout() to run after the rfcomm
session has been deleted.

Signed-off-by: Dean Jenkins <djenkins@mvista.com>
---
 net/bluetooth/rfcomm/core.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index 24d4d3c..c7921fd 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -264,8 +264,8 @@ static void rfcomm_session_clear_timer(struct rfcomm_session *s)
 {
 	BT_DBG("session %p state %ld", s, s->state);
 
-	if (timer_pending(&s->timer))
-		del_timer(&s->timer);
+	/* ensure rfcomm_session_timeout() is not running past this point */
+	del_timer_sync(&s->timer);
 }
 
 /* ---- RFCOMM DLCs ---- */
-- 
1.7.10.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists
  2012-08-11 18:47 [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt Dean Jenkins
                   ` (2 preceding siblings ...)
  2012-08-11 18:47 ` [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer Dean Jenkins
@ 2012-08-11 18:47 ` Dean Jenkins
  2012-08-21 18:50   ` Gustavo Padovan
  3 siblings, 1 reply; 9+ messages in thread
From: Dean Jenkins @ 2012-08-11 18:47 UTC (permalink / raw)
  To: linux-bluetooth

A race condition exists between near simultaneous asynchronous
DLC data channel disconnection requests from the host and remote device.
This causes the socket layer to request a socket shutdown at the same time
the rfcomm core is processing the disconnect request from the remote
device.

The socket layer retains a copy of a struct rfcomm_dlc d pointer.
The d pointer refers to a copy of a struct rfcomm_session.
When the socket layer thread performs a socket shutdown, the thread
may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
pointed to by d maybe freed due to rfcomm core handling. Consequently,
when the rfcomm lock becomes available and the thread runs, a
malfunction could occur as a freed rfcomm_session pointer and/or a
freed d pointer will be erroneously reused.

Therefore, after the rfcomm lock is acquired, check that the struct
rfcomm_session is still valid by searching the rfcomm session list.
If the session is valid then validate the d pointer by searching the
rfcomm session list of active DLCs for the rfcomm_dlc structure
pointed by d.

If active DLCs exist when the rfcomm session is terminating,
avoid a memory leak of rfcomm_dlc structures by ensuring that
rfcomm_session_close() is used instead of rfcomm_session_del().

Signed-off-by: Dean Jenkins <djenkins@mvista.com>
---
 net/bluetooth/rfcomm/core.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
index c7921fd..1a7db34 100644
--- a/net/bluetooth/rfcomm/core.c
+++ b/net/bluetooth/rfcomm/core.c
@@ -496,11 +496,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
 
 int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
 {
-	int r;
+	int r = 0;
+	struct rfcomm_dlc *d_list;
+	struct rfcomm_session *s, *s_list;
+	struct list_head *p, *n, *p2, *n2;
 
 	rfcomm_lock();
 
-	r = __rfcomm_dlc_close(d, err);
+	s = d->session;
+	if (s) {
+		/* check the session still exists after waiting on the mutex */
+		list_for_each_safe(p, n, &session_list) {
+			s_list = list_entry(p, struct rfcomm_session, list);
+			if (s == s_list) {
+				/* check the dlc still exists */
+				/* after waiting on the mutex */
+				list_for_each_safe(p2, n2, &s->dlcs) {
+					d_list = list_entry(p2,
+							struct rfcomm_dlc,
+							list);
+					if (d == d_list) {
+						r = __rfcomm_dlc_close(d, err);
+						break;
+					}
+				}
+				break;
+			}
+		}
+	}
 
 	rfcomm_unlock();
 	return r;
@@ -1155,7 +1178,7 @@ static struct rfcomm_session *rfcomm_recv_ua(struct rfcomm_session *s, u8 dlci)
 			break;
 
 		case BT_DISCONN:
-			s = rfcomm_session_del(s);
+			s = rfcomm_session_close(s, ECONNRESET);
 			break;
 		}
 	}
-- 
1.7.10.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists
  2012-08-11 18:47 ` [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists Dean Jenkins
@ 2012-08-21 18:50   ` Gustavo Padovan
  2012-08-30 15:40     ` Dean Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Gustavo Padovan @ 2012-08-21 18:50 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: linux-bluetooth

Hi Dean,

* Dean Jenkins <djenkins@mvista.com> [2012-08-11 19:47:10 +0100]:

> A race condition exists between near simultaneous asynchronous
> DLC data channel disconnection requests from the host and remote device.
> This causes the socket layer to request a socket shutdown at the same time
> the rfcomm core is processing the disconnect request from the remote
> device.
> 
> The socket layer retains a copy of a struct rfcomm_dlc d pointer.
> The d pointer refers to a copy of a struct rfcomm_session.
> When the socket layer thread performs a socket shutdown, the thread
> may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
> whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
> pointed to by d maybe freed due to rfcomm core handling. Consequently,
> when the rfcomm lock becomes available and the thread runs, a
> malfunction could occur as a freed rfcomm_session pointer and/or a
> freed d pointer will be erroneously reused.
> 
> Therefore, after the rfcomm lock is acquired, check that the struct
> rfcomm_session is still valid by searching the rfcomm session list.
> If the session is valid then validate the d pointer by searching the
> rfcomm session list of active DLCs for the rfcomm_dlc structure
> pointed by d.
> 
> If active DLCs exist when the rfcomm session is terminating,
> avoid a memory leak of rfcomm_dlc structures by ensuring that
> rfcomm_session_close() is used instead of rfcomm_session_del().
> 
> Signed-off-by: Dean Jenkins <djenkins@mvista.com>
> ---
>  net/bluetooth/rfcomm/core.c |   29 ++++++++++++++++++++++++++---
>  1 file changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index c7921fd..1a7db34 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -496,11 +496,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>  
>  int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>  {
> -	int r;
> +	int r = 0;
> +	struct rfcomm_dlc *d_list;
> +	struct rfcomm_session *s, *s_list;
> +	struct list_head *p, *n, *p2, *n2;
>  
>  	rfcomm_lock();
>  
> -	r = __rfcomm_dlc_close(d, err);
> +	s = d->session;
> +	if (s) {

Please invert the check here, and add a goto unlock;. There too many
indentation levels here.

> +		/* check the session still exists after waiting on the mutex */
> +		list_for_each_safe(p, n, &session_list) {
> +			s_list = list_entry(p, struct rfcomm_session, list);

please use list_for_each_entry(), the _safe version seems to not be needed.

> +			if (s == s_list) {
> +				/* check the dlc still exists */
> +				/* after waiting on the mutex */
> +				list_for_each_safe(p2, n2, &s->dlcs) {
> +					d_list = list_entry(p2,
> +							struct rfcomm_dlc,
> +							list);


and here you can use rfcomm_dlc_get()

	Gustavo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer
  2012-08-11 18:47 ` [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer Dean Jenkins
@ 2012-08-21 18:56   ` Gustavo Padovan
  2012-08-30 15:36     ` Dean Jenkins
  0 siblings, 1 reply; 9+ messages in thread
From: Gustavo Padovan @ 2012-08-21 18:56 UTC (permalink / raw)
  To: Dean Jenkins; +Cc: linux-bluetooth

Hi Dean,

* Dean Jenkins <djenkins@mvista.com> [2012-08-11 19:47:09 +0100]:

> rfcomm_session_timeout() protects the scenario of the remote
> Bluetooth device failing to send a DISC on the rfcomm control
> channel after the last data DLC channel has been closed.
> 
> There is a race condition between the timer expiring causing
> rfcomm_session_timeout() to run and the rfcomm session being
> deleted. If the rfcomm session is deleted then
> rfcomm_session_timeout() would use a freed rfcomm session
> pointer resulting in a potential kernel crash or memory corruption.
> Note the timer is cleared before the rfcomm session is deleted
> by del_timer() so the circumstances for a failure to occur are
> as follows:
> 
> rfcomm_session_timeout() needs to be executing before the
> del_timer() is called to clear the timer but
> rfcomm_session_timeout() needs to be delayed from using the
> rfcomm session pointer until after the session has been deleted.
> Therefore, there is a very small window of opportunity for failure.
> 
> The solution is to use del_timer_sync() instead of del_timer()
> as this ensures that rfcomm_session_timeout() is not running
> when del_timer_sync() returns. This means that it is not
> possible for rfcomm_session_timeout() to run after the rfcomm
> session has been deleted.

> 
> Signed-off-by: Dean Jenkins <djenkins@mvista.com>
> ---
>  net/bluetooth/rfcomm/core.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
> index 24d4d3c..c7921fd 100644
> --- a/net/bluetooth/rfcomm/core.c
> +++ b/net/bluetooth/rfcomm/core.c
> @@ -264,8 +264,8 @@ static void rfcomm_session_clear_timer(struct rfcomm_session *s)
>  {
>  	BT_DBG("session %p state %ld", s, s->state);
>  
> -	if (timer_pending(&s->timer))
> -		del_timer(&s->timer);
> +	/* ensure rfcomm_session_timeout() is not running past this point */
> +	del_timer_sync(&s->timer);

I'm not happy of the idea of let the stack broken between patches 2 and 3
(this one). As you said if we use del_timer_sync() we don't need rfcnt here,
can you add this as a first patch, maybe? and after this continue to remove
the rest of the refcount code?

	Gustavo

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer
  2012-08-21 18:56   ` Gustavo Padovan
@ 2012-08-30 15:36     ` Dean Jenkins
  0 siblings, 0 replies; 9+ messages in thread
From: Dean Jenkins @ 2012-08-30 15:36 UTC (permalink / raw)
  To: Gustavo Padovan, Dean Jenkins, linux-bluetooth

Hi Gustavo,

On 21 August 2012 19:56, Gustavo Padovan <gustavo@padovan.org> wrote:
> Hi Dean,
>
> * Dean Jenkins <djenkins@mvista.com> [2012-08-11 19:47:09 +0100]:
>
>> rfcomm_session_timeout() protects the scenario of the remote
>> Bluetooth device failing to send a DISC on the rfcomm control
>> channel after the last data DLC channel has been closed.
>>
>> There is a race condition between the timer expiring causing
>> rfcomm_session_timeout() to run and the rfcomm session being
>> deleted. If the rfcomm session is deleted then
>> rfcomm_session_timeout() would use a freed rfcomm session
>> pointer resulting in a potential kernel crash or memory corruption.
>> Note the timer is cleared before the rfcomm session is deleted
>> by del_timer() so the circumstances for a failure to occur are
>> as follows:
>>
>> rfcomm_session_timeout() needs to be executing before the
>> del_timer() is called to clear the timer but
>> rfcomm_session_timeout() needs to be delayed from using the
>> rfcomm session pointer until after the session has been deleted.
>> Therefore, there is a very small window of opportunity for failure.
>>
>> The solution is to use del_timer_sync() instead of del_timer()
>> as this ensures that rfcomm_session_timeout() is not running
>> when del_timer_sync() returns. This means that it is not
>> possible for rfcomm_session_timeout() to run after the rfcomm
>> session has been deleted.
>
>>
>> Signed-off-by: Dean Jenkins <djenkins@mvista.com>
>> ---
>>  net/bluetooth/rfcomm/core.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>> index 24d4d3c..c7921fd 100644
>> --- a/net/bluetooth/rfcomm/core.c
>> +++ b/net/bluetooth/rfcomm/core.c
>> @@ -264,8 +264,8 @@ static void rfcomm_session_clear_timer(struct rfcomm_session *s)
>>  {
>>       BT_DBG("session %p state %ld", s, s->state);
>>
>> -     if (timer_pending(&s->timer))
>> -             del_timer(&s->timer);
>> +     /* ensure rfcomm_session_timeout() is not running past this point */
>> +     del_timer_sync(&s->timer);
>
> I'm not happy of the idea of let the stack broken between patches 2 and 3
> (this one). As you said if we use del_timer_sync() we don't need rfcnt here,
> can you add this as a first patch, maybe? and after this continue to remove
> the rest of the refcount code?
>
>         Gustavo

Thanks for your feedback. Sorry for the delay in getting back to you.
My gmail failed to filter out your reply.

OK, I'll start with a patch for del_timer_sync(). This can be a
standalone patch.

For the removal of the refcnt, do you propose 1 patch to remove the
refcnt AND to manage the rfcomm session pointer ? I have doubts now
because you don't wish the stack to be broken between patches. Perhaps
it is possible to add a patch to manage the rfcomm session pointer
with the refcnt in place so we have "belt and braces" then have a
patch to remove the refcnt as the last thing to do.

I am open to suggestions.

Thanks,

Regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Professional Services UK/EMEA
MontaVista Software, LLC

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists
  2012-08-21 18:50   ` Gustavo Padovan
@ 2012-08-30 15:40     ` Dean Jenkins
  0 siblings, 0 replies; 9+ messages in thread
From: Dean Jenkins @ 2012-08-30 15:40 UTC (permalink / raw)
  To: Gustavo Padovan, Dean Jenkins, linux-bluetooth

Hi Gustavo,

On 21 August 2012 19:50, Gustavo Padovan <gustavo@padovan.org> wrote:
> Hi Dean,
>
> * Dean Jenkins <djenkins@mvista.com> [2012-08-11 19:47:10 +0100]:
>
>> A race condition exists between near simultaneous asynchronous
>> DLC data channel disconnection requests from the host and remote device.
>> This causes the socket layer to request a socket shutdown at the same time
>> the rfcomm core is processing the disconnect request from the remote
>> device.
>>
>> The socket layer retains a copy of a struct rfcomm_dlc d pointer.
>> The d pointer refers to a copy of a struct rfcomm_session.
>> When the socket layer thread performs a socket shutdown, the thread
>> may wait on a rfcomm lock in rfcomm_dlc_close(). This means that
>> whilst the thread waits, the rfcomm_session and/or rfcomm_dlc structures
>> pointed to by d maybe freed due to rfcomm core handling. Consequently,
>> when the rfcomm lock becomes available and the thread runs, a
>> malfunction could occur as a freed rfcomm_session pointer and/or a
>> freed d pointer will be erroneously reused.
>>
>> Therefore, after the rfcomm lock is acquired, check that the struct
>> rfcomm_session is still valid by searching the rfcomm session list.
>> If the session is valid then validate the d pointer by searching the
>> rfcomm session list of active DLCs for the rfcomm_dlc structure
>> pointed by d.
>>
>> If active DLCs exist when the rfcomm session is terminating,
>> avoid a memory leak of rfcomm_dlc structures by ensuring that
>> rfcomm_session_close() is used instead of rfcomm_session_del().
>>
>> Signed-off-by: Dean Jenkins <djenkins@mvista.com>
>> ---
>>  net/bluetooth/rfcomm/core.c |   29 ++++++++++++++++++++++++++---
>>  1 file changed, 26 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/bluetooth/rfcomm/core.c b/net/bluetooth/rfcomm/core.c
>> index c7921fd..1a7db34 100644
>> --- a/net/bluetooth/rfcomm/core.c
>> +++ b/net/bluetooth/rfcomm/core.c
>> @@ -496,11 +496,34 @@ static int __rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>>
>>  int rfcomm_dlc_close(struct rfcomm_dlc *d, int err)
>>  {
>> -     int r;
>> +     int r = 0;
>> +     struct rfcomm_dlc *d_list;
>> +     struct rfcomm_session *s, *s_list;
>> +     struct list_head *p, *n, *p2, *n2;
>>
>>       rfcomm_lock();
>>
>> -     r = __rfcomm_dlc_close(d, err);
>> +     s = d->session;
>> +     if (s) {
>
> Please invert the check here, and add a goto unlock;. There too many
> indentation levels here.
>
>> +             /* check the session still exists after waiting on the mutex */
>> +             list_for_each_safe(p, n, &session_list) {
>> +                     s_list = list_entry(p, struct rfcomm_session, list);
>
> please use list_for_each_entry(), the _safe version seems to not be needed.
>
>> +                     if (s == s_list) {
>> +                             /* check the dlc still exists */
>> +                             /* after waiting on the mutex */
>> +                             list_for_each_safe(p2, n2, &s->dlcs) {
>> +                                     d_list = list_entry(p2,
>> +                                                     struct rfcomm_dlc,
>> +                                                     list);
>
>
> and here you can use rfcomm_dlc_get()
>
>         Gustavo

Thanks for your feedback. Sorry for the delay getting back to you. I
blame gmail.

OK, I will take a look at redoing the patch as per your suggestion.

Thanks,

Regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Professional Services UK/EMEA
MontaVista Software, LLC

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-08-30 15:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-11 18:47 [PATCH 0/4] Bluetooth: Details of rfcomm fixes - remove session refcnt Dean Jenkins
2012-08-11 18:47 ` [PATCH 1/4] Bluetooth: Remove rfcomm " Dean Jenkins
2012-08-11 18:47 ` [PATCH 2/4] Bluetooth: Return rfcomm session pointers to avoid freed session Dean Jenkins
2012-08-11 18:47 ` [PATCH 3/4] Bluetooth: Avoid rfcomm_session_timeout using freed pointer Dean Jenkins
2012-08-21 18:56   ` Gustavo Padovan
2012-08-30 15:36     ` Dean Jenkins
2012-08-11 18:47 ` [PATCH 4/4] Bluetooth: On socket shutdown check rfcomm session and DLC exists Dean Jenkins
2012-08-21 18:50   ` Gustavo Padovan
2012-08-30 15:40     ` Dean Jenkins

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.