* Deadlock in bluetooth/sco.c
From: Jan Kucera @ 2009-05-03 20:46 UTC
To: marcel; +Cc: linux-bluetooth

Hi,

I've found a possible deadlock in net/bluetooth/sco.c - version
2.6.28 (probably this code is in newer versions too).
Could someone confirm this? Thank you.


net/bluetooth/sco.c
==============

function sco_conn_ready: (conn <- sk)
-------------------------------------
locking sco_conn_lock(conn) at line 796
bh_lock_sock(sk) at line 800

function sco_conn_del: (sk <- conn)
---------------------------------
bh_lock_sock(sk); at line 154
calling function sco_chan_del(sk, err); at line 156
where at line 767 is sco_conn_lock(conn);

caught by Stanse http://iti.fi.muni.cz/stanse/
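[Editorial sketch] The report above describes two paths that take the same
pair of locks in opposite order. The following is a minimal user-space model
of that check, not kernel code and not how Stanse works internally. The
names lock_id, CONN_LOCK, SOCK_LOCK, record_path and has_inversion are
hypothetical and exist only for this illustration; CONN_LOCK stands in for
sco_conn_lock(conn) and SOCK_LOCK for bh_lock_sock(sk).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers standing in for the two locks named in the report. */
enum lock_id { CONN_LOCK, SOCK_LOCK, NUM_LOCKS };

/* held_before[a][b] is true once some path has taken lock b while holding a. */
static bool held_before[NUM_LOCKS][NUM_LOCKS];

/* Forget all recorded orderings. */
void reset_orderings(void)
{
	for (int a = 0; a < NUM_LOCKS; a++)
		for (int b = 0; b < NUM_LOCKS; b++)
			held_before[a][b] = false;
}

/* Record one code path as the sequence of locks it acquires, outermost first. */
void record_path(const enum lock_id *path, size_t n)
{
	assert(path != NULL);
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			held_before[path[i]][path[j]] = true;
}

/* True when two recorded paths take the same pair of locks in opposite order. */
bool has_inversion(void)
{
	for (int a = 0; a < NUM_LOCKS; a++)
		for (int b = a + 1; b < NUM_LOCKS; b++)
			if (held_before[a][b] && held_before[b][a])
				return true;
	return false;
}
```

Recording { CONN_LOCK, SOCK_LOCK } for sco_conn_ready and then
{ SOCK_LOCK, CONN_LOCK } for sco_conn_del makes has_inversion() return true.
Whether the two paths can actually run concurrently in the kernel is a
separate question that this model does not answer.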
* Re: Deadlock in bluetooth/sco.c
From: Marcel Holtmann @ 2009-05-04 1:17 UTC
To: Jan Kucera; +Cc: linux-bluetooth

Hi Jan,

> I've found a possible deadlock in net/bluetooth/sco.c - version
> 2.6.28 (probably this code is in newer versions too).
> Could someone confirm this? Thank you.
>
>
> net/bluetooth/sco.c
> ==============
>
> function sco_conn_ready: (conn <- sk)
> -------------------------------------
> locking sco_conn_lock(conn) at line 796
> bh_lock_sock(sk) at line 800
>
> function sco_conn_del: (sk <- conn)
> ---------------------------------
> bh_lock_sock(sk); at line 154
> calling function sco_chan_del(sk, err); at line 156
> where at line 767 is sco_conn_lock(conn);

Can you please re-test with the bluetooth-testing.git tree so we can
verify whether this issue still exists?

Regards

Marcel
[parent not found: <9F0C1DB20AFA954FA1DA05309350433D5F913D45@pdsmsx503.ccr.corp.intel.com>]
* kernel carsh using Bluez on Netbook platform
From: Xu, Martin @ 2009-05-05 8:06 UTC
To: linux-bluetooth; +Cc: Liu, Bing Wei

Hi:

On netbook platforms (Eeepc 901; "Aspire One + Omiz Bluetooth dongle"), the kernel crashes easily when using BlueZ, for example during pairing, l2ping and rfcomm.
I am using kernel 2.6.29.
I caught the crash message below:

BUG: spinlock bad magic on CPU#0, swapper/0
BUG: unable to handle kernel paging request at 00646733
IP: [<c0508736>] spin_bug+0x5a/0x87
*pdpt = 0000000000a1b001 *pde = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1:1.0/bluetooth/hci0/hci0/hci0:42/type
...
EIP is at spin_bug+0x5a/0x87
...
Call Trace:
[c......] ? _raw_spin_lock+0x1e/0x11c
[c......] ? spin_unlock_irqrestore+0x22/0x25
[c......] ? _spin_lock_irqsave+0x17/0x1c
[c......] skb_dequeue+0x2a/0x94
[c......] skb_queue_purge+0x14/0x1b
[c......] hci_conn_del+0x10e/0x115
[c......] hci_event_packet+0x620/0x29b7
[c......] enqueue_task_fair+0xxxx/0xxxx
[c......] _spin_unlock_irqrestore+0xxxx/0xxxx
[c......] try_to_wake_up+0xxxx/0xxxx
[c......] default_wake_function+0xxxx/0xxxx
[c......] pollwake+0xxxx/0xxxx
[c......] default_wake_function+0xxxx/0xxxx
[c......] wake_up_common+0xxxx/0xxxx
[c......] spin_unlock_irqrestore+0xxxx/0xxxx
[c......] __wake_up_sync+0xxxx/0xxxx
[c......] _read_unlock+0xxxx/0xxxx
[c......] sock_def_readable+0xxxx/0xxxx
[c......] sock_queue_rcv_skb+0xxxx/0xxxx
[c......] _read_unlock+0xxxx/0xxxx
[c......] hci_send_to_socket+0xxxx/0xxxx
[c......] hci_rx_task+0xxxx/0xxxx
[c......] tasklet_action+0xxxx/0xxxx
[c......] __do_softirq+0xxxx/0xxxx
[c......] do_softirq+0xxxx/0xxxx
[c......] irq_exit+0xxxx/0xxxx
[c......] do_IRQ+0xxxx/0xxxx
[c......] common_interrupt+0xxxx/0xxxx
[c......] acpi_idle_enter_bm+0xxxx/0xxxx
[c......] cpuidle_idle_call+0xxxx/0xxxx
[c......] cpu_idle+0xxxx/0xxxx
[c......] rest_init+0xxxx/0xxxx

Also see these moblin bugs:
https://bugzilla.moblin.org/show_bug.cgi?id=1919 Executing l2ping from netbook caused the system hang
https://bugzilla.moblin.org/show_bug.cgi?id=2006 The user can't use bluetooth function sometimes
https://bugzilla.moblin.org/show_bug.cgi?id=1543 A issue in rfcomm connection
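[Editorial sketch] For readers unfamiliar with the first line of the trace:
with spinlock debugging enabled, the kernel keeps a magic value inside each
spinlock and reports "bad magic" when that value does not match. That
usually points at lock memory that was never initialised, or that was freed
or overwritten. The following user-space model only illustrates the idea.
struct demo_lock and the demo_* helpers are hypothetical names, and the fill
byte used by demo_scribble is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Marker stored in every initialised demo lock (demo value for this sketch). */
#define DEMO_LOCK_MAGIC 0xdead4eadu

/* Hypothetical lock type: a magic marker plus a trivial state field. */
struct demo_lock {
	uint32_t magic;
	int locked;
};

/* Initialise the lock and stamp it with the magic marker. */
void demo_lock_init(struct demo_lock *l)
{
	assert(l != NULL);
	l->magic = DEMO_LOCK_MAGIC;
	l->locked = 0;
}

/* Sanity check done before using the lock: is the marker still intact? */
bool demo_lock_magic_ok(const struct demo_lock *l)
{
	assert(l != NULL);
	return l->magic == DEMO_LOCK_MAGIC;
}

/* Simulate the lock's memory being overwritten, as after a premature free. */
void demo_scribble(struct demo_lock *l)
{
	assert(l != NULL);
	memset(l, 0x6b, sizeof(*l));
}
```

After demo_lock_init() the check passes; after demo_scribble() it fails,
which is the situation a "bad magic" report describes.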
* RE: kernel carsh using Bluez on Netbook platform
From: Xu, Martin @ 2009-05-05 8:06 UTC
To: linux-bluetooth; +Cc: Liu, Bing Wei

>On netbook platform( Eeepc 901; "Aspire One + Omiz Bluetooth dongle"), when using
>bluez, such as pairing, l2ping and rfcomm, kernel crashes easily.
>I am using kernel 2.6.29.

>I caught the crash message:
>BUG: spinlock bad magic on CPU#0, swapper/0
>BUG: unable to handle kernel paging request at 00646733

I have done some research on the issue and found that in
hci_event.c: hci_disconn_complete_evt()
after
hci_conn_del_sysfs(conn)
the contents of conn may be modified, such as
conn->idle_timer
conn->disc_timer
and
conn->list
which leads to a kernel crash when hci_conn_del(conn) runs.

I wrote a patch that runs hci_conn_del_sysfs() after hci_conn_del() and found that it fixes the issue. Can someone tell me whether the patch is OK, and what the root cause of the issue is? Thanks! :)

diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index f91ba69..1999ac1 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -1009,10 +1009,9 @@ static inline void hci_disconn_complete_evt(struct hci_dev *hdev, struct sk_buff
 	if (conn) {
 		conn->state = BT_CLOSED;
 
-		hci_conn_del_sysfs(conn);
-
 		hci_proto_disconn_ind(conn, ev->reason);
 		hci_conn_del(conn);
+		hci_conn_del_sysfs(conn);
 	}
 
 	hci_dev_unlock(hdev);
* RE: kernel carsh using Bluez on Netbook platform
From: Marcel Holtmann @ 2009-05-05 15:43 UTC
To: Xu, Martin; +Cc: linux-bluetooth, Liu, Bing Wei

Hi Martin,

> >On netbook platform( Eeepc 901; "Aspire One + Omiz Bluetooth dongle"), when using
> >bluez, such as pairing, l2ping and rfcomm, kernel crashes easily.
> >I am using kernel 2.6.29.
>
> >I caught the crash message:
> >BUG: spinlock bad magic on CPU#0, swapper/0
> >BUG: unable to handle kernel paging request at 00646733
>
> I have done some research on the issue and found that at
> hci_event.c: hci_disconn_complete_evt()
> After
> hci_conn_del_sysfs(conn)
> The contents of conn maybe modified
> Such as
> conn->idle_timer
> conn->disc_timer
> and
> conn->list
> that leads to crash of kernel when run hci_conn_del(conn)
>
> I worked a patch to run hci_conn_del_sysfs after hci_conn_del and find that the issue can be fixed. Some one can tell me whether the patch is ok, and the root cause of the issue. Thanks! :)
>
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index f91ba69..1999ac1 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -1009,10 +1009,9 @@ static inline void hci_disconn_complete_evt(struct
> hci_dev *hdev, struct sk_buff
>  	if (conn) {
>  		conn->state = BT_CLOSED;
>
> -		hci_conn_del_sysfs(conn);
> -
>  		hci_proto_disconn_ind(conn, ev->reason);
>  		hci_conn_del(conn);
> +		hci_conn_del_sysfs(conn);
>  	}
>
>  	hci_dev_unlock(hdev);

Can you verify whether a bluetooth-testing.git kernel still produces
this NULL pointer dereference? It looks a little bit different, but I
think that actually got fixed now.

Regards

Marcel
* RE: kernel carsh using Bluez on Netbook platform
From: Marcel Holtmann @ 2009-05-05 16:08 UTC
To: Xu, Martin; +Cc: linux-bluetooth, Liu, Bing Wei

Hi Martin,

> > >On netbook platform( Eeepc 901; "Aspire One + Omiz Bluetooth dongle"), when using
> > >bluez, such as pairing, l2ping and rfcomm, kernel crashes easily.
> > >I am using kernel 2.6.29.
> >
> > >I caught the crash message:
> > >BUG: spinlock bad magic on CPU#0, swapper/0
> > >BUG: unable to handle kernel paging request at 00646733
> >
> > I have done some research on the issue and found that at
> > hci_event.c: hci_disconn_complete_evt()
> > After
> > hci_conn_del_sysfs(conn)
> > The contents of conn maybe modified
> > Such as
> > conn->idle_timer
> > conn->disc_timer
> > and
> > conn->list
> > that leads to crash of kernel when run hci_conn_del(conn)
> >
> > I worked a patch to run hci_conn_del_sysfs after hci_conn_del and find that the issue can be fixed. Some one can tell me whether the patch is ok, and the root cause of the issue. Thanks! :)
> >
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index f91ba69..1999ac1 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -1009,10 +1009,9 @@ static inline void hci_disconn_complete_evt(struct
> > hci_dev *hdev, struct sk_buff
> >  	if (conn) {
> >  		conn->state = BT_CLOSED;
> >
> > -		hci_conn_del_sysfs(conn);
> > -
> >  		hci_proto_disconn_ind(conn, ev->reason);
> >  		hci_conn_del(conn);
> > +		hci_conn_del_sysfs(conn);
> >  	}
> >
> >  	hci_dev_unlock(hdev);
>
> can you verify that a bluetooth-testing.git kernel would still produce
> this NULL pointer dereference. It looks a little bit different, but I
> think that actually got fixed now.

I just double-checked the kernel patches and since you are still running
a 2.6.29 kernel you might be missing this patch:

    Bluetooth: Move hci_conn_del_sysfs() back to avoid device destruct too early

@@ -287,6 +287,8 @@ int hci_conn_del(struct hci_conn *conn)
 
 	skb_queue_purge(&conn->data_q);
 
+	hci_conn_del_sysfs(conn);
+
 	return 0;
 }
 
@@ -560,8 +562,6 @@ void hci_conn_hash_flush(struct hci_dev *hdev)
 
 		c->state = BT_CLOSED;
 
-		hci_conn_del_sysfs(c);
-
 		hci_proto_disconn_cfm(c, 0x16);
 		hci_conn_del(c);
 	}

The code changed a lot when Simple Pairing support was added, so
you might need a special patch if you want to keep using 2.6.29. I would
still advise you to check with bluetooth-testing.git first and, if that
works, to backport all of the Bluetooth patches. The Fedora kernel
already contains two patches for the backport, and the missing ones can
easily be added on top of it.

Regards

Marcel
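[Editorial sketch] The patch above moves hci_conn_del_sysfs() to the end of
hci_conn_del(), after the queue purge. The following user-space model shows
why the order of such teardown steps matters. It assumes, as the patch title
suggests, that the sysfs removal can lead to the connection object being
destroyed. struct demo_conn and the demo_* functions are hypothetical names
for this illustration only, and the "released" flag replaces a real free so
that the wrong order can be observed safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a connection object. */
struct demo_conn {
	int queued;     /* stand-in for packets waiting on a data queue */
	bool released;  /* set once the object must no longer be touched */
};

/* Stand-in for the step that may end the object's lifetime. */
void demo_release(struct demo_conn *c)
{
	assert(c != NULL);
	c->released = true;
}

/*
 * Stand-in for purging the queue. Returns false when it is asked to touch
 * an object that was already released, which models a use-after-free.
 */
bool demo_purge(struct demo_conn *c)
{
	assert(c != NULL);
	if (c->released)
		return false;
	c->queued = 0;
	return true;
}

/* Problematic order: release first, then purge the queue. */
bool demo_teardown_release_first(struct demo_conn *c)
{
	demo_release(c);
	return demo_purge(c);
}

/* Order after the change: purge the queue, release as the last step. */
bool demo_teardown_release_last(struct demo_conn *c)
{
	bool ok = demo_purge(c);

	demo_release(c);
	return ok;
}
```

demo_teardown_release_first() reports a failure because the purge runs on an
object that is already marked released, while demo_teardown_release_last()
succeeds and leaves the queue empty. The real kernel code involves deferred
work and reference counting that this sketch leaves out.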
* RE: kernel carsh using Bluez on Netbook platform
From: Xu, Martin @ 2009-05-06 2:36 UTC
To: Marcel Holtmann; +Cc: linux-bluetooth

Marcel:

Thank you very much, that is really helpful!