All of lore.kernel.org
 help / color / mirror / Atom feed
* Kernel panic when handing Motorola S305 headset
@ 2010-01-22 10:12 Liang Bao
       [not found] ` <6aeb672b1001272156m678aa8fepec2498386947936a@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Liang Bao @ 2010-01-22 10:12 UTC (permalink / raw)
  To: linux-bluetooth

[-- Attachment #1: Type: text/plain, Size: 9416 bytes --]

Hello,

Recently we found an almost 100% reproducible issue when using
Motorola S305 stereo headset and Ubuntu 9.10. The Ubuntu system has
auto-update turned on and keeps up with the latest stable release. At
this point, the system is using:

bash$dpkg -l | grep bluez
ii  bluez                                      4.51-0ubuntu2
                   Bluetooth tools and daemons
ii  bluez-alsa                                 4.51-0ubuntu2
                   Bluetooth audio support
ii  bluez-compat                               4.51-0ubuntu2
                   BlueZ 3.x compatibility binaries
ii  bluez-cups                                 4.51-0ubuntu2
                   Bluetooth printer driver for CUPS
rc  bluez-gnome                                1.8-0ubuntu5
                   Bluetooth utilities for GNOME
ii  bluez-gstreamer                            4.51-0ubuntu2
                   Bluetooth GStreamer support
ii  bluez-hcidump                              1.42-1build1
                   Analyses Bluetooth HCI packets
ii  bluez-utils                                4.51-0ubuntu2
                   Transitional package

bash$uname -a
Linux wahoo 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 16:20:31 UTC
2009 i686 GNU/Linux

The pattern to reproduce the issue is:
1. Pair the S305 headset from Ubuntu using the bluetooth icon in the
task bar, "Set up new device..."
2. In the "Bluetooth Preferences" screen, hit "Remove" button to
remove the paired S305.
3. Power off and then power on S305.
4. The S305 will try to connect and get a negative link key reply.
You'll see the PIN input dialog flash away quickly because after the
negative reply, the LMP link is detached.
5. After step 4, the S305 willl request again and this time you'll be
given the dialog to input the PIN. Put '0000' and you'll see the
desktop frozen. Keyboard, mouse, network ... everything is down.

This phenomenon is quite similar with what Lan Zhu described in last
September(see the end of this mail). Although Ubuntu does't record the
panic dump by default, we did some analysis based on the hcidump log
and the sniffer log. Here's our thought.

The S305 is a stereo headset which supports both A2DP and HFP. When I
power cycle it in step 3, it will try to connect but as the Ubuntu
system has removed it from paired list, connection will fail as
mentioned in step 4 above(see the packets at line 74-115 in the
attached log).

After that, the S305 will try again to initiate a pair, which can be
reflected by messages from line 116 to 164. You can find pair succeeds
and encryption starts.

Then comes our weird thing. The S305 seems remember the capability of
the Ubuntu system and directly try to connect AVDTP and SDP almost at
the same time(line 165,166,172,173). For both L2CAP connection
request, the Ubuntu system responds with "connection pending"(167-169
and 174-176). Shortly the channel for SDP is ready(line 190-192) and
an "authentication pending" response is sent to the headset (line
193-195) for the pending AVDTP connection request. After that the
desktop freezes and network interfaces(I mean ssh here) is down, which
I believe kernel panic happens at this point. The only thing I can do
is to press the power button on the PC case.

With the Droid+S305 problemthat Lan Zhu mentioned in his mail, I
checked into the kernel code carefully. I admit that I can't
understand the kernel part well and therefore below is just my guess.
I'd like to share to see if it can help. The two L2CAP connection
arrive in a quite short interval. These two requests are put into a
link list by bt_accept_enqueue() and l2cap_do_start() will send out
L2CAP_INFO_REQ for both. If the SDP connection is accepted first,
bt_accept_dequeue() will remove it from the link list. This probably
caused the AVDTP entry has its parent as NULL. When AVDTP connection
is accepted in a while, l2cap_conn_start() will run into problem here
becuse the AVDTP is deferred too:

				if (bt_sk(sk)->defer_setup) {
					struct sock *parent = bt_sk(sk)->parent;
					rsp.result = cpu_to_le16(L2CAP_CR_PEND);
					rsp.status = cpu_to_le16(L2CAP_CS_AUTHOR_PEND);
					parent->sk_data_ready(parent, 0);

				}

Again, this is still my guess and could be in wrong direction.
However, the issue is quite easy to be reproduced with the combination
of S305 and Ubuntu 9.10(at least the versions I mentioned above). Is
this also a kind of race condition happening with those weird devices
try to make L2CAP connections simultaneously instead of one by one?
Could someone suggest a fix to prevent the kernel panic in this
scenario?

BTW, the SDP record of my Ubuntu system is listed here.

bash$sdptool records local
Service Name: BlueZ PANU service
Service Description: BlueZ PAN service
Service RecHandle: 0x10000
Service Class ID List:
  "PAN User" (0x1115)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 15
  "BNEP" (0x000f)
    Version: 0x0100
    SEQ16: 800 806
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100
Profile Descriptor List:
  "PAN User" (0x1115)
    Version: 0x0100

Service Name: BlueZ GN service
Service Description: BlueZ PAN service
Service RecHandle: 0x10001
Service Class ID List:
  "PAN Group Network" (0x1117)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 15
  "BNEP" (0x000f)
    Version: 0x0100
    SEQ16: 800 806
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100
Profile Descriptor List:
  "PAN Group Network" (0x1117)
    Version: 0x0100

Service Name: BlueZ NAP service
Service Description: BlueZ PAN service
Service RecHandle: 0x10002
Service Class ID List:
  "Network Access Point" (0x1116)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 15
  "BNEP" (0x000f)
    Version: 0x0100
    SEQ16: 800 806
Language Base Attr List:
  code_ISO639: 0x656e
  encoding:    0x6a
  base_offset: 0x100
Profile Descriptor List:
  "Network Access Point" (0x1116)
    Version: 0x0100

Service Name: Headset Audio Gateway
Service RecHandle: 0x10003
Service Class ID List:
  "Headset Audio Gateway" (0x1112)
  "Generic Audio" (0x1203)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 12
Profile Descriptor List:
  "Headset" (0x1108)
    Version: 0x0102

Service Name: Hands-Free Audio Gateway
Service RecHandle: 0x10004
Service Class ID List:
  "Handsfree Audio Gateway" (0x111f)
  "Generic Audio" (0x1203)
Protocol Descriptor List:
  "L2CAP" (0x0100)
  "RFCOMM" (0x0003)
    Channel: 13
Profile Descriptor List:
  "Handsfree" (0x111e)
    Version: 0x0105

Service Name: Audio Source
Service RecHandle: 0x10005
Service Class ID List:
  "Audio Source" (0x110a)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 25
  "AVDTP" (0x0019)
    uint16: 0x100
Profile Descriptor List:
  "Advanced Audio" (0x110d)
    Version: 0x0100

Service Name: AVRCP TG
Service RecHandle: 0x10006
Service Class ID List:
  "AV Remote Target" (0x110c)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 23
  "AVCTP" (0x0017)
    uint16: 0x100
Profile Descriptor List:
  "AV Remote" (0x110e)
    Version: 0x0100

Service Name: AVRCP CT
Service RecHandle: 0x10007
Service Class ID List:
  "AV Remote" (0x110e)
Protocol Descriptor List:
  "L2CAP" (0x0100)
    PSM: 23
  "AVCTP" (0x0017)
    uint16: 0x100
Profile Descriptor List:
  "AV Remote" (0x110e)
    Version: 0x0100

From: Lan Zhu <zhu.lan.cn@...>
Subject: Re: null pointer error in bluez kernel
Newsgroups: gmane.linux.bluez.kernel
Date: 2009-10-13 02:13:18 GMT (14 weeks, 3 days, 7 hours and 50 minutes ago)

Hi Marcel,

>
> Reproduce steps:
> 1. Pair and connect with Motorola S305 headset.
> 2. Disconnect and unpair with the headset.
> 3. Turn off and then turn on the headset. The headset will auto pair with phone.
> 4. Input PIN code "0000" on the phone to complete the incoming pairing.
>
> Repeat step 2-4 for many times, then kernel panic may happen right
> after step 4.
>
> From the kernel log, I found if the bt_accept_unlink() is called
> before l2cap_conn_start(), then panic will happen because in the
> bt_accept_unlink() function it set parent to NULL.
>
> Below is the call order  when the result is successful. We can see the
> parent is not NULL.
>
> [  190.162475] bt_accept_enqueue: parent ccda5298, sk cdb68920
> [  190.170104] bt_accept_enqueue: parent ccda5d10, sk cdf5cd90
> [  190.191223] l2cap_conn_start: conn cd14a320
> [  190.218719] l2cap_conn_start: conn cd14a320
> [  190.223480] l2cap_conn_start: @@@ in l2cap_conn_start --- sk =
> cdb68920, parent = ccda5298
> [  190.235565] bt_accept_unlink: sk cdb68920 state 6
>
> Below is the call order when the result is kernel panic.
> bt_accept_unlink is called first, then we can see the parent is NULL.
>
> [  238.188812] bt_accept_enqueue: parent ccda5298, sk ccf60040
> [  238.196350] bt_accept_enqueue: parent ccda5d10, sk cdf5c960
> [  238.217590] l2cap_conn_start: conn cd14a848
> [  238.223449] bt_accept_unlink: sk ccf60040 state 6
> [  238.229400] l2cap_sock_accept: new socket ccf60040
> [  238.245086] l2cap_conn_start: conn cd14a848
> [  238.249725] l2cap_conn_start: @@@ in l2cap_conn_start --- sk =
> ccf60040, parent = (null)
> [  238.258636] Unable to handle kernel NULL pointer dereference at
> virtual address 00000120
> [  238.267456] pgd = cdb34000
> [  238.270446] [00000120] *pgd=8db32031, *pte=00000000, *ppte=00000000
> [  238.277740] Internal error: Oops: 17 [#1] PREEMPT
>

[-- Attachment #2: s305-panics-with-ubuntu.hcidump.log --]
[-- Type: application/octet-stream, Size: 10860 bytes --]

        Media Codec - SBC
          16kHz 32kHz 44.1kHz 48kHz
          Mono DualChannel Stereo JointStereo
          4 8 12 16 Blocks
          4 8 Subbands
          SNR Loudness
          Bitpool Range 2-53
        Content Protection
          02 00
2010-01-22 16:51:37.003109 < ACL data: handle 1 flags 0x02 dlen 7
    L2CAP(d): cid 0x0042 len 3 [psm 25]
      AVDTP(s): Suspend cmd: transaction 2
        ACP SEID 1
2010-01-22 16:51:37.016208 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:37.041213 > ACL data: handle 1 flags 0x02 dlen 6
    L2CAP(d): cid 0x0041 len 2 [psm 25]
      AVDTP(s): Suspend rsp: transaction 2
2010-01-22 16:51:38.001425 < ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn req: dcid 0x0044 scid 0x0042
2010-01-22 16:51:38.001450 < ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn req: dcid 0x0042 scid 0x0041
2010-01-22 16:51:38.021217 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:38.023207 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:38.077208 > ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn rsp: dcid 0x0044 scid 0x0042
2010-01-22 16:51:38.105207 > ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn rsp: dcid 0x0042 scid 0x0041
2010-01-22 16:51:42.331214 > ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Config req: dcid 0x0044 flags 0x00 clen 4
      MTU 895
2010-01-22 16:51:42.331225 < ACL data: handle 1 flags 0x02 dlen 10
    L2CAP(s): Command rej: reason 0
      Command not understood
2010-01-22 16:51:42.353207 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:55.515141 < ACL data: handle 1 flags 0x02 dlen 8
    L2CAP(d): cid 0x0041 len 4 [psm 3]
      RFCOMM(s): DISC: cr 0 dlci 26 pf 1 ilen 0 fcs 0x67
2010-01-22 16:51:55.525213 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:55.530210 > ACL data: handle 1 flags 0x02 dlen 8
    L2CAP(d): cid 0x0040 len 4 [psm 3]
      RFCOMM(s): UA: cr 0 dlci 26 pf 1 ilen 0 fcs 0x4d
2010-01-22 16:51:55.530234 < ACL data: handle 1 flags 0x02 dlen 8
    L2CAP(d): cid 0x0041 len 4 [psm 3]
      RFCOMM(s): DISC: cr 0 dlci 0 pf 1 ilen 0 fcs 0x9c
2010-01-22 16:51:55.538211 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:55.544209 > ACL data: handle 1 flags 0x02 dlen 8
    L2CAP(d): cid 0x0040 len 4 [psm 3]
      RFCOMM(s): UA: cr 0 dlci 0 pf 1 ilen 0 fcs 0xb6
2010-01-22 16:51:55.544228 < ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn req: dcid 0x0041 scid 0x0040
2010-01-22 16:51:55.553209 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:51:55.567210 > ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Disconn rsp: dcid 0x0041 scid 0x0040
2010-01-22 16:51:58.000116 < HCI Command: Disconnect (0x01|0x0006) plen 3
    handle 1 reason 0x13
    Reason: Remote User Terminated Connection
2010-01-22 16:51:58.003209 > HCI Event: Command Status (0x0f) plen 4
    Disconnect (0x01|0x0006) status 0x00 ncmd 1
2010-01-22 16:51:58.100210 > HCI Event: Disconn Complete (0x05) plen 4
    status 0x00 handle 1 reason 0x16
    Reason: Connection Terminated by Local Host
2010-01-22 16:52:00.215810 < HCI Command: Delete Stored Link Key (0x03|0x0012) plen 7
    bdaddr 00:0D:FD:2D:36:E0 all 0
2010-01-22 16:52:00.219209 > HCI Event: Command Complete (0x0e) plen 6
    Delete Stored Link Key (0x03|0x0012) ncmd 1
    status 0x00 deleted 0
2010-01-22 16:52:18.123212 > HCI Event: Connect Request (0x04) plen 10
    bdaddr 00:0D:FD:2D:36:E0 class 0x240404 type ACL
2010-01-22 16:52:18.123229 < HCI Command: Accept Connection Request (0x01|0x0009) plen 7
    bdaddr 00:0D:FD:2D:36:E0 role 0x00
    Role: Master
2010-01-22 16:52:18.127211 > HCI Event: Command Status (0x0f) plen 4
    Accept Connection Request (0x01|0x0009) status 0x00 ncmd 1
2010-01-22 16:52:18.287213 > HCI Event: Role Change (0x12) plen 8
    status 0x00 bdaddr 00:0D:FD:2D:36:E0 role 0x00
    Role: Master
2010-01-22 16:52:18.295212 > HCI Event: Connect Complete (0x03) plen 11
    status 0x00 handle 1 bdaddr 00:0D:FD:2D:36:E0 type ACL encrypt 0x00
2010-01-22 16:52:18.295224 < HCI Command: Read Remote Supported Features (0x01|0x001b) plen 2
    handle 1
2010-01-22 16:52:18.298213 > HCI Event: Command Status (0x0f) plen 4
    Read Remote Supported Features (0x01|0x001b) status 0x00 ncmd 1
2010-01-22 16:52:18.298226 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
    bdaddr 00:0D:FD:2D:36:E0 mode 2 clkoffset 0x0000
2010-01-22 16:52:18.301220 > HCI Event: Command Status (0x0f) plen 4
    Remote Name Request (0x01|0x0019) status 0x00 ncmd 1
2010-01-22 16:52:18.311225 > HCI Event: Max Slots Change (0x1b) plen 3
    handle 1 slots 5
2010-01-22 16:52:18.321219 > HCI Event: Connection Packet Type Changed (0x1d) plen 5
    status 0x00 handle 1 ptype 0x660c
    Packet type: DM1 DM3 DM5 3-DH1 3-DH3 3-DH5
2010-01-22 16:52:18.358213 > HCI Event: Read Remote Supported Features (0x0b) plen 11
    status 0x00 handle 1
    Features: 0xff 0xfe 0x0f 0x86 0x8b 0xa9 0x00 0x00
2010-01-22 16:52:18.377211 > HCI Event: Remote Name Req Complete (0x07) plen 255
    status 0x00 bdaddr 00:0D:FD:2D:36:E0 name 'Motorola S305'
2010-01-22 16:52:18.405212 > HCI Event: Link Key Request (0x17) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:18.405342 < HCI Command: Link Key Request Negative Reply (0x01|0x000c) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:18.409211 > HCI Event: Command Complete (0x0e) plen 10
    Link Key Request Negative Reply (0x01|0x000c) ncmd 1
    status 0x00 bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:18.410211 > HCI Event: PIN Code Request (0x16) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:18.852212 > HCI Event: Disconn Complete (0x05) plen 4
    status 0x00 handle 1 reason 0x13
    Reason: Remote User Terminated Connection
2010-01-22 16:52:24.520215 > HCI Event: Connect Request (0x04) plen 10
    bdaddr 00:0D:FD:2D:36:E0 class 0x240404 type ACL
2010-01-22 16:52:24.520233 < HCI Command: Accept Connection Request (0x01|0x0009) plen 7
    bdaddr 00:0D:FD:2D:36:E0 role 0x00
    Role: Master
2010-01-22 16:52:24.524211 > HCI Event: Command Status (0x0f) plen 4
    Accept Connection Request (0x01|0x0009) status 0x00 ncmd 1
2010-01-22 16:52:24.685213 > HCI Event: Role Change (0x12) plen 8
    status 0x00 bdaddr 00:0D:FD:2D:36:E0 role 0x00
    Role: Master
2010-01-22 16:52:24.692212 > HCI Event: Connect Complete (0x03) plen 11
    status 0x00 handle 1 bdaddr 00:0D:FD:2D:36:E0 type ACL encrypt 0x00
2010-01-22 16:52:24.692225 < HCI Command: Read Remote Supported Features (0x01|0x001b) plen 2
    handle 1
2010-01-22 16:52:24.695214 > HCI Event: Command Status (0x0f) plen 4
    Read Remote Supported Features (0x01|0x001b) status 0x00 ncmd 1
2010-01-22 16:52:24.695228 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
    bdaddr 00:0D:FD:2D:36:E0 mode 2 clkoffset 0x0000
2010-01-22 16:52:24.698214 > HCI Event: Command Status (0x0f) plen 4
    Remote Name Request (0x01|0x0019) status 0x00 ncmd 1
2010-01-22 16:52:24.709219 > HCI Event: Max Slots Change (0x1b) plen 3
    handle 1 slots 5
2010-01-22 16:52:24.719214 > HCI Event: Connection Packet Type Changed (0x1d) plen 5
    status 0x00 handle 1 ptype 0x660c
    Packet type: DM1 DM3 DM5 3-DH1 3-DH3 3-DH5
2010-01-22 16:52:24.755213 > HCI Event: Read Remote Supported Features (0x0b) plen 11
    status 0x00 handle 1
    Features: 0xff 0xfe 0x0f 0x86 0x8b 0xa9 0x00 0x00
2010-01-22 16:52:24.774211 > HCI Event: Remote Name Req Complete (0x07) plen 255
    status 0x00 bdaddr 00:0D:FD:2D:36:E0 name 'Motorola S305'
2010-01-22 16:52:24.803212 > HCI Event: Link Key Request (0x17) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:24.803341 < HCI Command: Link Key Request Negative Reply (0x01|0x000c) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:24.807211 > HCI Event: Command Complete (0x0e) plen 10
    Link Key Request Negative Reply (0x01|0x000c) ncmd 1
    status 0x00 bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:24.808212 > HCI Event: PIN Code Request (0x16) plen 6
    bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:31.531128 < HCI Command: PIN Code Request Reply (0x01|0x000d) plen 23
    bdaddr 00:0D:FD:2D:36:E0 len 4 pin '0000'
2010-01-22 16:52:31.534213 > HCI Event: Command Complete (0x0e) plen 10
    PIN Code Request Reply (0x01|0x000d) ncmd 1
    status 0x00 bdaddr 00:0D:FD:2D:36:E0
2010-01-22 16:52:31.794212 > HCI Event: Link Key Notification (0x18) plen 23
    bdaddr 00:0D:FD:2D:36:E0 key A70C5E7EC19CA400E556E121626B35B6 type 0
    Type: Combination Key
2010-01-22 16:52:31.916215 > HCI Event: Encrypt Change (0x08) plen 4
    status 0x00 handle 1 encrypt 0x01
2010-01-22 16:52:31.918214 > ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Connect req: psm 25 scid 0x0041
2010-01-22 16:52:31.918228 < ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Connect rsp: dcid 0x0040 scid 0x0041 result 1 status 0
      Connection pending - No futher information available
2010-01-22 16:52:31.918232 < ACL data: handle 1 flags 0x02 dlen 10
    L2CAP(s): Info req: type 2
2010-01-22 16:52:31.922213 > ACL data: handle 1 flags 0x02 dlen 12
    L2CAP(s): Connect req: psm 1 scid 0x0042
2010-01-22 16:52:31.922224 < ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Connect rsp: dcid 0x0041 scid 0x0042 result 1 status 0
      Connection pending - No futher information available
2010-01-22 16:52:31.922227 < ACL data: handle 1 flags 0x02 dlen 10
    L2CAP(s): Info req: type 2
2010-01-22 16:52:31.923213 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:52:31.924213 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:52:31.926214 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:52:31.927213 > HCI Event: Number of Completed Packets (0x13) plen 5
    handle 1 packets 1
2010-01-22 16:52:31.964213 > ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Info rsp: type 2 result 0
      Extended feature mask 0x0000
2010-01-22 16:52:31.964222 < ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Connect rsp: dcid 0x0041 scid 0x0042 result 0 status 0
      Connection successful
2010-01-22 16:52:31.964226 < ACL data: handle 1 flags 0x02 dlen 16
    L2CAP(s): Connect rsp: dcid 0x0040 scid 0x0041 result 1 status 1
      Connection pending - Authentication pending
2010-01-22 16:52:31.964230 < HCI Command: Authentication Requested (0x01|0x0011) plen 2
    handle 1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Kernel panic when handing Motorola S305 headset
       [not found] ` <6aeb672b1001272156m678aa8fepec2498386947936a@mail.gmail.com>
@ 2010-03-01  9:54   ` Liang Bao
  2010-03-15 17:27     ` jaikumar Ganesh
  2010-04-08  0:15     ` Nick Pelly
  0 siblings, 2 replies; 4+ messages in thread
From: Liang Bao @ 2010-03-01  9:54 UTC (permalink / raw)
  To: linux-bluetooth

[-- Attachment #1: Type: text/plain, Size: 2206 bytes --]

I'd like to continue the previous thread on that Motorola S305 causes
kernel panic because I did find some clue here. Sorry for misleading
guess one month ago if any.

Recap the problem here so that you don't to read the first long post.
The pattern to reproduce the issue is:
1. Pair the S305 headset from the phone or the PC( I am using a Ubuntu)
2. Remove pairing on the phone or PC
3. Power off and then power on S305.
4. The S305 will try to connect and since link key removed on this
side it will try to pair. Input 0000.
5. Kernel panic happens. This can be observed on kernel version
2.6.29(on the Droid phone, yes, it's a modified version),
2.6.31-19-generic on a Ubuntu and a pretty latest 2.6.33-020633rc8
from Ubuntu official RC release.

The exact kernel crash point is
             if (l2cap_check_security(sk)) {
                  if (bt_sk(sk)->defer_setup) {
                      struct sock *parent = bt_sk(sk)->parent;
                      rsp.result = cpu_to_le16(L2CAP_CR_PEND);
                      rsp.status = cpu_to_le16(L2CAP_CS_AUTHOR_PEND);
>>>                   parent->sk_data_ready(parent, 0)
                  } else {

After tracing the issue for a couple of weeks, I find the difference
between a normal flow and the panic one. If the user space process
accepts the L2CAP connection request before L2CAP_INFO_RSP received,
the following calls will be carried out:

l2cap_sock_accept-> bt_accept_dequeue->bt_accept_unlink(in the branch
bt_sk(parent)->defer_setup)-> set bt_sk(sk)->parent = NULL. Later when
L2CAP_INFO_RSP arrives, the l2cap_conn_start() will try to call the
marked line above and de-referring NULL happen.

To fix this, shall we consider checking if a pending socket can be
accepted in bt_accept_dequeue() prior to a pending L2CAP_INFO_REQ
responded? For example,  adding a check to BT_CONNECT2 in
af_bluetooth.c.

215         if (sk->sk_state == BT_CONNECTED || !newsock ||
216                         ( bt_sk(parent)->defer_setup &&
(sk->sk_state != BT_CONNECT2))) {

Again, I am not sure if this will bring a side-effect. Please advise
the most appropriate way. Thanks.

p.s: I attached partial trace files for those who're interested to the traces.

[-- Attachment #2: s305-good-bad-trace.txt --]
[-- Type: text/plain, Size: 1891 bytes --]

Trace when panic happended:
<3>[  356.758483] l2cap_connect_req: psm 0x19 scid 0x0041
<3>[  356.763854] l2cap_connect_req: parent c7e8f600
<3>[  356.768829] __l2cap_chan_add: conn c7fa37c0, psm 0x19, dcid 0x0041
<3>[  356.775482] l2cap_connect_req: before return, sk c5d66c00 parent c7e8f600
<3>[  356.782897] l2cap_connect_req: psm 0x01 scid 0x0042
<3>[  356.788177] l2cap_connect_req: parent c7e8fe00
<3>[  356.793121] __l2cap_chan_add: conn c7fa37c0, psm 0x01, dcid 0x0042
<3>[  356.799774] l2cap_connect_req: before return, sk c5d66a00 parent c7e8fe00
<3>[  356.837158] l2cap_sock_accept: new socket c5d66c00
<3>[  356.842590] l2cap_conn_start: BT_CONNECTE2, sk c5d66c00, parent (null)

Trace when everything goes on well.
<3>[  641.393463] l2cap_connect_req: psm 0x19 scid 0x0041
<3>[  641.398956] l2cap_connect_req: parent c7e91200
<3>[  641.403778] __l2cap_chan_add: conn c9519440, psm 0x19, dcid 0x0041
<3>[  641.410461] l2cap_connect_req: before return, sk c97e3a00 parent c7e91200
<3>[  641.417877] l2cap_connect_req: psm 0x01 scid 0x0042
<3>[  641.423156] l2cap_connect_req: parent c64b0c00
<3>[  641.428131] __l2cap_chan_add: conn c9519440, psm 0x01, dcid 0x0042
<3>[  641.434753] l2cap_connect_req: before return, sk c97e3400 parent c64b0c00
<3>[  641.476165] l2cap_conn_start: BT_CONNECTE2, sk c97e3a00, parent c7e91200
<3>[  641.483856] l2cap_sock_accept: new socket c97e3a00
<3>[  641.551513] l2cap_sock_accept: new socket c97e3400
<3>[  641.592742] l2cap_sock_shutdown: sock cea7a840, sk c97e3a00, how 2
<3>[  641.607727] l2cap_sock_shutdown: sock cea7a0c0, sk c97e3400, how 2
<3>[  644.055419] __l2cap_chan_add: conn c9519440, psm 0x01, dcid 0x0000
<3>[  650.055816] l2cap_sock_shutdown: sock cea7a0c0, sk c97e3400, how 2
<3>[  650.063018] l2cap_sock_shutdown: sk-shutdown = true
<3>[  650.069427] __l2cap_sock_close: sk c97e3400 state 1 socket cea7a0c0 reason 0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Kernel panic when handing Motorola S305 headset
  2010-03-01  9:54   ` Liang Bao
@ 2010-03-15 17:27     ` jaikumar Ganesh
  2010-04-08  0:15     ` Nick Pelly
  1 sibling, 0 replies; 4+ messages in thread
From: jaikumar Ganesh @ 2010-03-15 17:27 UTC (permalink / raw)
  To: Liang Bao, Marcel Holtmann, padovan; +Cc: linux-bluetooth


[-- Attachment #1.1: Type: text/plain, Size: 2440 bytes --]

Hi Marcel / Gustavo:

On Mon, Mar 1, 2010 at 2:54 AM, Liang Bao <tim.bao@gmail.com> wrote:

> I'd like to continue the previous thread on that Motorola S305 causes
> kernel panic because I did find some clue here. Sorry for misleading
> guess one month ago if any.
>
> Recap the problem here so that you don't to read the first long post.
> The pattern to reproduce the issue is:
> 1. Pair the S305 headset from the phone or the PC( I am using a Ubuntu)
> 2. Remove pairing on the phone or PC
> 3. Power off and then power on S305.
> 4. The S305 will try to connect and since link key removed on this
> side it will try to pair. Input 0000.
> 5. Kernel panic happens. This can be observed on kernel version
> 2.6.29(on the Droid phone, yes, it's a modified version),
> 2.6.31-19-generic on a Ubuntu and a pretty latest 2.6.33-020633rc8
> from Ubuntu official RC release.
>
> The exact kernel crash point is
>             if (l2cap_check_security(sk)) {
>                   if (bt_sk(sk)->defer_setup) {
>                      struct sock *parent = bt_sk(sk)->parent;
>                      rsp.result = cpu_to_le16(L2CAP_CR_PEND);
>                      rsp.status = cpu_to_le16(L2CAP_CS_AUTHOR_PEND);
> >>>                   parent->sk_data_ready(parent, 0)
>                   } else {
>
> After tracing the issue for a couple of weeks, I find the difference
> between a normal flow and the panic one. If the user space process
> accepts the L2CAP connection request before L2CAP_INFO_RSP received,
> the following calls will be carried out:
>
> l2cap_sock_accept-> bt_accept_dequeue->bt_accept_unlink(in the branch
> bt_sk(parent)->defer_setup)-> set bt_sk(sk)->parent = NULL. Later when
> L2CAP_INFO_RSP arrives, the l2cap_conn_start() will try to call the
> marked line above and de-referring NULL happen.
>
> To fix this, shall we consider checking if a pending socket can be
> accepted in bt_accept_dequeue() prior to a pending L2CAP_INFO_REQ
> responded? For example,  adding a check to BT_CONNECT2 in
> af_bluetooth.c.
>
> 215         if (sk->sk_state == BT_CONNECTED || !newsock ||
> 216                         ( bt_sk(parent)->defer_setup &&
> (sk->sk_state != BT_CONNECT2))) {
>
> Again, I am not sure if this will bring a side-effect. Please advise
> the most appropriate way. Thanks.
>
> p.s: I attached partial trace files for those who're interested to the
> traces.
>


Can you ACK or NACK the analysis ?

Thanks
Jaikumar

[-- Attachment #1.2: Type: text/html, Size: 2997 bytes --]

[-- Attachment #2: s305-good-bad-trace.txt --]
[-- Type: text/plain, Size: 1891 bytes --]

Trace when panic happended:
<3>[  356.758483] l2cap_connect_req: psm 0x19 scid 0x0041
<3>[  356.763854] l2cap_connect_req: parent c7e8f600
<3>[  356.768829] __l2cap_chan_add: conn c7fa37c0, psm 0x19, dcid 0x0041
<3>[  356.775482] l2cap_connect_req: before return, sk c5d66c00 parent c7e8f600
<3>[  356.782897] l2cap_connect_req: psm 0x01 scid 0x0042
<3>[  356.788177] l2cap_connect_req: parent c7e8fe00
<3>[  356.793121] __l2cap_chan_add: conn c7fa37c0, psm 0x01, dcid 0x0042
<3>[  356.799774] l2cap_connect_req: before return, sk c5d66a00 parent c7e8fe00
<3>[  356.837158] l2cap_sock_accept: new socket c5d66c00
<3>[  356.842590] l2cap_conn_start: BT_CONNECTE2, sk c5d66c00, parent (null)

Trace when everything goes on well.
<3>[  641.393463] l2cap_connect_req: psm 0x19 scid 0x0041
<3>[  641.398956] l2cap_connect_req: parent c7e91200
<3>[  641.403778] __l2cap_chan_add: conn c9519440, psm 0x19, dcid 0x0041
<3>[  641.410461] l2cap_connect_req: before return, sk c97e3a00 parent c7e91200
<3>[  641.417877] l2cap_connect_req: psm 0x01 scid 0x0042
<3>[  641.423156] l2cap_connect_req: parent c64b0c00
<3>[  641.428131] __l2cap_chan_add: conn c9519440, psm 0x01, dcid 0x0042
<3>[  641.434753] l2cap_connect_req: before return, sk c97e3400 parent c64b0c00
<3>[  641.476165] l2cap_conn_start: BT_CONNECTE2, sk c97e3a00, parent c7e91200
<3>[  641.483856] l2cap_sock_accept: new socket c97e3a00
<3>[  641.551513] l2cap_sock_accept: new socket c97e3400
<3>[  641.592742] l2cap_sock_shutdown: sock cea7a840, sk c97e3a00, how 2
<3>[  641.607727] l2cap_sock_shutdown: sock cea7a0c0, sk c97e3400, how 2
<3>[  644.055419] __l2cap_chan_add: conn c9519440, psm 0x01, dcid 0x0000
<3>[  650.055816] l2cap_sock_shutdown: sock cea7a0c0, sk c97e3400, how 2
<3>[  650.063018] l2cap_sock_shutdown: sk-shutdown = true
<3>[  650.069427] __l2cap_sock_close: sk c97e3400 state 1 socket cea7a0c0 reason 0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Kernel panic when handing Motorola S305 headset
  2010-03-01  9:54   ` Liang Bao
  2010-03-15 17:27     ` jaikumar Ganesh
@ 2010-04-08  0:15     ` Nick Pelly
  1 sibling, 0 replies; 4+ messages in thread
From: Nick Pelly @ 2010-04-08  0:15 UTC (permalink / raw)
  To: Liang Bao; +Cc: linux-bluetooth, Marcel Holtmann

On Mon, Mar 1, 2010 at 1:54 AM, Liang Bao <tim.bao@gmail.com> wrote:
> I'd like to continue the previous thread on that Motorola S305 causes
> kernel panic because I did find some clue here. Sorry for misleading
> guess one month ago if any.
>
> Recap the problem here so that you don't to read the first long post.
> The pattern to reproduce the issue is:
> 1. Pair the S305 headset from the phone or the PC( I am using a Ubuntu)
> 2. Remove pairing on the phone or PC
> 3. Power off and then power on S305.
> 4. The S305 will try to connect and since link key removed on this
> side it will try to pair. Input 0000.
> 5. Kernel panic happens. This can be observed on kernel version
> 2.6.29(on the Droid phone, yes, it's a modified version),
> 2.6.31-19-generic on a Ubuntu and a pretty latest 2.6.33-020633rc8
> from Ubuntu official RC release.
>
> The exact kernel crash point is
> =A0 =A0 =A0 =A0 =A0 =A0 if (l2cap_check_security(sk)) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if (bt_sk(sk)->defer_setup) {
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0struct sock *parent =3D bt_sk(=
sk)->parent;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0rsp.result =3D cpu_to_le16(L2C=
AP_CR_PEND);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0rsp.status =3D cpu_to_le16(L2C=
AP_CS_AUTHOR_PEND);
>>>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 parent->sk_data_ready(parent, 0)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0} else {
>
> After tracing the issue for a couple of weeks, I find the difference
> between a normal flow and the panic one. If the user space process
> accepts the L2CAP connection request before L2CAP_INFO_RSP received,
> the following calls will be carried out:
>
> l2cap_sock_accept-> bt_accept_dequeue->bt_accept_unlink(in the branch
> bt_sk(parent)->defer_setup)-> set bt_sk(sk)->parent =3D NULL. Later when
> L2CAP_INFO_RSP arrives, the l2cap_conn_start() will try to call the
> marked line above and de-referring NULL happen.
>
> To fix this, shall we consider checking if a pending socket can be
> accepted in bt_accept_dequeue() prior to a pending L2CAP_INFO_REQ
> responded? For example, =A0adding a check to BT_CONNECT2 in
> af_bluetooth.c.
>
> 215 =A0 =A0 =A0 =A0 if (sk->sk_state =3D=3D BT_CONNECTED || !newsock ||
> 216 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 ( bt_sk(parent)->defe=
r_setup &&
> (sk->sk_state !=3D BT_CONNECT2))) {
>
> Again, I am not sure if this will bring a side-effect. Please advise
> the most appropriate way. Thanks.
>
> p.s: I attached partial trace files for those who're interested to the tr=
aces.
>

We can reproduce this issue.

There is nothing preventing an l2cap socket with deferred setup from
accepting an l2cap connection before the info response packet has come
in. This causes the null pointer panic when the info response
eventually arrives.

I'm not sure the best way to fix this. Ideally we'd check
L2CAP_INFO_FEAT_MASK_REQ_DONE, but that is not available in
af_bluetooth.c:bt_accept_dequeue(). I think the problem is that
BT_CONNECT2 - which is available in af_bluetooth.c is used for both
deferred setup and for the case where we are waiting for the info
response.

Marcel, some advice on the best way to proceed here would be helpful.

Nick

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-04-08  0:15 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-22 10:12 Kernel panic when handing Motorola S305 headset Liang Bao
     [not found] ` <6aeb672b1001272156m678aa8fepec2498386947936a@mail.gmail.com>
2010-03-01  9:54   ` Liang Bao
2010-03-15 17:27     ` jaikumar Ganesh
2010-04-08  0:15     ` Nick Pelly

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.