Hi Marcel, On 04/04/17 21:36, Dean Jenkins wrote: > > > On 03/04/17 16:51, Marcel Holtmann wrote: >> >> If this is an issue in 4.10, then lets get this fixed / hardened. >> > > If I manage to produce some more useful results then I will post them. > I have now managed to crash the h4 Data Link protocol layer via hci_uart_tty_close(). This confirms that there is a design flaw in hci_uart_tty_close() which is independent of the Bluetooth Data Link protocol layers. I don't have a Bluetooth Radio Module that uses h4 protocol so I used my BCSP enabled Bluetooth Radio Module that has a USB to serial interface. I realise that this is a weird setup but it is OK for this testcase because we need the h4 protocol to be timing out for transmissions. Also the BCSP Bluetooth Radio Module may send BCSP frames which will exercise the h4 receive path although rejection of the frames should occur which is as expected. Please see the attached tarball v4.10.5.h4_crash.tgz, the contents are: v4.10.5.h4_crash/ v4.10.5.h4_crash/0001-btattach-Add-option-r-for-sending-HCI-RESET-on-close.patch v4.10.5.h4_crash/0002-btattach-Fix-raw_device-flag-usage.patch v4.10.5.h4_crash/0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch v4.10.5.h4_crash/enable_hci_bt_logging.sh v4.10.5.h4_crash/btattach_loop_h4.sh v4.10.5.h4_crash/bt_hci_uart_ldisc_h4_crash_snippet_log.txt 0001-btattach-Add-option-r-for-sending-HCI-RESET-on-close.patch 0002-btattach-Fix-raw_device-flag-usage.patch These btattach patches add a command line option -r to override HCI_UART_RESET_ON_INIT so that the kernel HCI UART LDISC component can use the quirk HCI_QUIRK_RESET_ON_CLOSE. This means hci_uart_tty_close() is allowed to a send a HCI RESET command during closure of the HCI UART LDISC which increases the probability of a kernel crash occurring due to the design flaw in hci_uart_tty_close(). 0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch is a kernel patch for v4.10.5 to add some debug to show when hci_uart_tty_close() gets called. enable_hci_bt_logging.sh Enabled dynamic debug of some Bluetooth files to allow dmesg to show when the Bluetooth functions run. However, this seems unreliable and the script needs to be run multiple times to get all the files to log properly. btattach_loop_h4.sh is a simple test script, and the content is: #!/bin/bash let i=1 while [ true ] do echo "loop $i" ./btattach -A /dev/ttyUSB0 -P h4 -r & sleep 5 echo "now killing..." killall btattach i=$(($i + 1)) done Note that I used my -r option. I suspect I should of used -B option, sigh. Here is the kernel crash backtrace from bt_hci_uart_ldisc_h4_crash_snippet_log.txt, it seems easy to reproduce: [ 563.702430] BUG: unable to handle kernel NULL pointer dereference at 000000000000001c [ 563.702491] IP: _raw_spin_lock_irqsave+0x22/0x40 [ 563.702520] PGD 13230e067 [ 563.702521] PUD 1238d6067 [ 563.702540] PMD 0 [ 563.704521] Oops: 0002 [#1] SMP [ 563.706215] Modules linked in: hci_uart btqca nf_log_ipv6 ip6table_nat nf_nat_ipv6 ip6t_REJECT nf_reject_ipv6 ip6table_mangle xt_set ip_set_hash_ip ip_set nf_log_ipv4 nf_log_common xt_LOG xt_recent iptable_nat nf_nat_ipv4 xt_comment ipt_REJECT nf_reject_ipv4 xt_addrtype bridge stp llc xt_mark iptable_mangle xt_tcpudp iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_CT nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp ip6table_raw nf_nat_sip xt_multiport nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_h323 nf_nat_ftp xt_conntrack nf_nat_amanda nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp [ 563.715634] ts_kmp nf_conntrack_amanda nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables lockd grace ctr ccm af_packet bnep msr arc4 brcmsmac uvcvideo cordic brcmutil ftdi_sio videobuf2_vmalloc usbserial intel_powerclamp mac80211 videobuf2_memops btusb videobuf2_v4l2 btrtl coretemp snd_hda_codec_hdmi btbcm videobuf2_core btintel kvm_intel snd_hda_codec_realtek bluetooth kvm videodev joydev cfg80211 snd_hda_codec_generic media snd_hda_intel snd_hda_codec psmouse snd_hda_core irqbypass e1000e snd_hwdep input_leds snd_pcm iTCO_wdt iTCO_vendor_support serio_raw crc32c_intel toshiba_acpi snd_timer ptp mei_me fjes bcma snd mei thermal sparse_keymap toshiba_haps industrialio lpc_ich soundcore battery ac pps_core shpchp intel_ips toshiba_bluetooth wmi rfkill tpm_tis tpm_tis_core [ 563.726159] tpm cpufreq_ondemand cpufreq_conservative cpufreq_powersave acpi_cpufreq evdev nvram sunrpc sch_fq_codel ipv6 crc_ccitt autofs4 hid_generic usbhid hid sdhci_pci mmc_block sdhci sr_mod ehci_pci ehci_hcd mmc_core usbcore usb_common i915 video button i2c_algo_bit drm_kms_helper drm [ 563.730623] CPU: 0 PID: 29 Comm: kworker/0:1 Not tainted 4.10.54.10.5_debug+ #13 [ 563.732862] Hardware name: TOSHIBA Satellite R630/Portable PC, BIOS Version 1.40 07/20/2010 [ 563.735127] Workqueue: events hci_uart_write_work [hci_uart] [ 563.737376] task: ffffa03f32951b00 task.stack: ffffbc4ec0768000 [ 563.739620] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40 [ 563.741839] RSP: 0018:ffffbc4ec076bdd8 EFLAGS: 00010046 [ 563.744059] RAX: 0000000000000000 RBX: 0000000000000202 RCX: ffffa03f37c1cd60 [ 563.746286] RDX: 0000000000000001 RSI: ffffa03f37c18ae0 RDI: 000000000000001c [ 563.748493] RBP: ffffbc4ec076bde0 R08: ffffa03f32332380 R09: 0000000000000001 [ 563.750690] R10: ffffe9de427ce800 R11: ffffa03f33498600 R12: 0000000000000008 [ 563.752884] R13: 000000000000001c R14: ffffa03e79b1ad38 R15: ffffa03e79b1a240 [ 563.755075] FS: 0000000000000000(0000) GS:ffffa03f37c00000(0000) knlGS:0000000000000000 [ 563.757260] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 563.759426] CR2: 000000000000001c CR3: 000000011d7a3000 CR4: 00000000000006f0 [ 563.761610] Call Trace: [ 563.763798] skb_dequeue+0x1d/0x70 [ 563.765975] h4_dequeue+0x16/0x20 [hci_uart] [ 563.768128] hci_uart_write_work+0xc4/0x100 [hci_uart] [ 563.770262] process_one_work+0x151/0x410 [ 563.772368] worker_thread+0x69/0x4b0 [ 563.774466] kthread+0x114/0x150 [ 563.776519] ? rescuer_thread+0x340/0x340 [ 563.778546] ? kthread_park+0x90/0x90 [ 563.780566] ret_from_fork+0x2c/0x40 [ 563.782559] Code: 89 e5 e8 82 96 98 ff 5d c3 66 66 66 66 90 55 48 89 e5 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 00 00 0f b1 17 85 c0 75 06 48 89 d8 5b 5d c3 89 c6 e8 29 83 98 ff [ 563.786800] RIP: _raw_spin_lock_irqsave+0x22/0x40 RSP: ffffbc4ec076bdd8 [ 563.788940] CR2: 000000000000001c [ 563.798389] ---[ end trace 326429e7baa9f359 ]--- Please see my analysis in bt_hci_uart_ldisc_h4_crash_snippet_log.txt With my patchset applied, this crash does not occur because locking is present which inhibits the Data Link protocol layer's dequeue function from running when the Data Link protocol layer is about to be closed down. I will try to provide some "good" logs of my patchset with btattach using BCSP. Thanks, Best regards, Dean -- Dean Jenkins Embedded Software Engineer Linux Transportation Solutions Mentor Embedded Software Division Mentor Graphics (UK) Ltd.