All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dean Jenkins <Dean_Jenkins@mentor.com>
To: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <gustavo@padovan.org>,
	Johan Hedberg <johan.hedberg@gmail.com>,
	<linux-bluetooth@vger.kernel.org>
Subject: Re: [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes
Date: Tue, 4 Apr 2017 21:36:33 +0100	[thread overview]
Message-ID: <3662704c-dfd4-67db-a2f9-45c949c45c6c@mentor.com> (raw)
In-Reply-To: <119BB9FC-C735-405B-9A77-E9F102393B7D@holtmann.org>

[-- Attachment #1: Type: text/plain, Size: 7084 bytes --]

Hi Marcel,

On 03/04/17 16:51, Marcel Holtmann wrote:
>
> If this is an issue in 4.10, then lets get this fixed / hardened.
>

I have created a simple test script that leads to a kernel crash in HCI 
UART LDISC. Please note that this is a different scenario to the one 
documented in my previous E-mails but the resulting crash is similar.

I am using a BCSP supported Bluetooth Radio Module connected via a USB 
to serial interface to a 64-bit x86 Linux laptop.

I built kernel v4.10.5 although I messed up the .config settings so 
uname -r gave a strange string of4.10.54.10.5_debug+, sigh

Please note that "btattach" fails to establish a BCSP connection with 
the Bluetooth Radio Module but that is advantageous in exposing a fatal 
race condition in hci_uart_tty_close(). In this case, the BCSP Data Link 
protocol layer attempts to retransmit every 250ms and the BCSP layer 
gets closed down whilst still attempting to retransmit which causes a 
kernel crash.

My proposed patchset fixes this race condition plus some other issues.

Please refer to my attached tarball v4.10.5_bt_crash_info.tgz, the 
contents are:
v4.10.5_pure_testing/
v4.10.5_pure_testing/0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch
v4.10.5_pure_testing/v4.10.5_pure_testing/btattach_loop.sh
v4.10.5_pure_testing/enable_hci_bt_logging.sh
v4.10.5_pure_testing/bt_hci_uart_ldisc_crash_snippet_log.txt

The patch file 
0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch
is just adding debug so you can see that killing "btattach" causes 
do_group_exit() to run in the kernel which cleans-up by using 
tty_ldisc_kill() which calls tty_ldisc_close() which causes 
hci_uart_tty_close() to run.

enable_hci_bt_logging.sh
enables dynamic debug for dmesg although dynamic debug seems unreliable 
these days at least for me and I have to run enable_hci_bt_logging.sh 
multiple times to get all the files logging properly.

bt_hci_uart_ldisc_crash_snippet_log.txt
This shows my analysis of 2 example NULL pointer dereference kernel 
crashes which are easy to reproduce using my test script.

v4.10.5_pure_testing/btattach_loop.sh
This is the simple test script which causes a kernel crash within 10 
minutes of running, it is easily repeatable with my equipment. I think 
the kernel crash occurs before the loop counter reaches 100.

The contents of the script is:

#!/bin/bash

let i=1

while [ true ]
do
         echo "loop $i"
         ./btattach -A /dev/ttyUSB0 -P bcsp &
         sleep 5
         echo "now killing..."
         killall btattach
         i=$(($i + 1))
done

Example kernel crash backtrace:

[ 1561.709737] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000044
[ 1561.709829] IP: _raw_spin_lock_irqsave+0x22/0x40
[ 1561.709862] PGD 0

[ 1561.709889] Oops: 0002 [#1] SMP
[ 1561.709911] Modules linked in: ftdi_sio rfcomm hci_uart btqca xt_set 
ip_set_hash_ip nf_log_ipv6 ip_set nf_log_ipv4 nf_log_common xt_LOG 
ip6table_nat nf_nat_ipv6 xt_recent iptable_nat nf_nat_ipv4 ipt_REJECT 
nf_reject_ipv4 iptable_mangle iptable_raw nf_conntrack_ipv4 
nf_defrag_ipv4 xt_comment ip6t_REJECT nf_reject_ipv6 xt_addrtype bridge 
stp llc xt_mark ip6table_mangle nf_nat_tftp nf_nat_snmp_basic 
nf_conntrack_snmp xt_tcpudp nf_nat_sip xt_CT nf_nat_pptp 
nf_nat_proto_gre nf_nat_irc ip6table_raw nf_nat_h323 xt_multiport 
nf_nat_ftp nf_nat_amanda nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack 
nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane 
nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink 
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc 
nf_conntrack_h323
[ 1561.710172]  nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack 
iptable_filter ip_tables ip6table_filter ip6_tables x_tables lockd grace 
ctr ccm af_packet bnep msr usbserial uvcvideo btusb btrtl arc4 brcmsmac 
btbcm videobuf2_vmalloc btintel cordic videobuf2_memops bluetooth 
intel_powerclamp brcmutil videobuf2_v4l2 videobuf2_core mac80211 
videodev coretemp kvm_intel kvm joydev media cfg80211 iTCO_wdt 
iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_realtek psmouse 
snd_hda_codec_generic toshiba_acpi input_leds bcma snd_hda_intel 
irqbypass snd_hda_codec toshiba_bluetooth cpufreq_ondemand sparse_keymap 
crc32c_intel cpufreq_conservative industrialio snd_hda_core serio_raw 
wmi rfkill thermal cpufreq_powersave battery ac snd_hwdep snd_pcm 
acpi_cpufreq snd_timer snd toshiba_haps mei_me mei e1000e
[ 1561.718028]  fjes soundcore evdev ptp lpc_ich shpchp pps_core 
intel_ips tpm_tis tpm_tis_core nvram tpm sch_fq_codel sunrpc ipv6 
crc_ccitt autofs4 hid_generic usbhid hid sdhci_pci mmc_block sdhci 
sr_mod ehci_pci ehci_hcd mmc_core usbcore usb_common i915 video button 
i2c_algo_bit drm_kms_helper drm [last unloaded: ftdi_sio]
[ 1561.721317] CPU: 1 PID: 4473 Comm: kworker/1:0 Not tainted 
4.10.54.10.5_debug+ #12
[ 1561.722981] Hardware name: TOSHIBA Satellite R630/Portable PC, BIOS 
Version 1.40   07/20/2010
[ 1561.724660] Workqueue: events hci_uart_write_work [hci_uart]
[ 1561.726377] task: ffff98d9b4f15100 task.stack: ffffa868c1178000
[ 1561.728568] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
[ 1561.730355] RSP: 0000:ffffa868c117bda8 EFLAGS: 00010046
[ 1561.731993] RAX: 0000000000000000 RBX: 0000000000000296 RCX: 
0000000000079ad6
[ 1561.733646] RDX: 0000000000000001 RSI: ffff98da77c5c580 RDI: 
0000000000000044
[ 1561.735314] RBP: ffffa868c117bdb0 R08: 000000000001c5a0 R09: 
ffffffff9965a54a
[ 1561.736991] R10: fffff48441dc9a80 R11: ffff98da73403700 R12: 
0000000000000030
[ 1561.738641] R13: 0000000000000044 R14: 0000000000000030 R15: 
ffff98d9b4e50000
[ 1561.740288] FS:  0000000000000000(0000) GS:ffff98da77c40000(0000) 
knlGS:0000000000000000
[ 1561.741940] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1561.743585] CR2: 0000000000000044 CR3: 0000000128d1a000 CR4: 
00000000000006e0
[ 1561.745238] Call Trace:
[ 1561.746882]  skb_dequeue+0x1d/0x70
[ 1561.748512]  bcsp_dequeue+0x27/0x180 [hci_uart]
[ 1561.750130]  ? kfree_skbmem+0x5a/0x60
[ 1561.751742]  hci_uart_write_work+0xc4/0x100 [hci_uart]
[ 1561.753356]  process_one_work+0x151/0x410
[ 1561.754930]  worker_thread+0x69/0x4b0
[ 1561.756507]  kthread+0x114/0x150
[ 1561.758076]  ? rescuer_thread+0x340/0x340
[ 1561.759636]  ? kthread_park+0x90/0x90
[ 1561.761191]  ret_from_fork+0x2c/0x40
[ 1561.762693] Code: 89 e5 e8 82 96 98 ff 5d c3 66 66 66 66 90 55 48 89 
e5 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 
00 00 <f0> 0f b1 17 85 c0 75 06 48 89 d8 5b 5d c3 89 c6 e8 29 83 98 ff
[ 1561.766034] RIP: _raw_spin_lock_irqsave+0x22/0x40 RSP: ffffa868c117bda8
[ 1561.768033] CR2: 0000000000000044
[ 1561.774555] ---[ end trace 51cc1d3575c0e559 ]---

Please see my analysis in bt_hci_uart_ldisc_crash_snippet_log.txt

With my patchset applied, this crash appears to no longer occur.

If I manage to produce some more useful results then I will post them.

Thanks,

Best regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.


[-- Attachment #2: v4.10.5_bt_crash_info.tgz --]
[-- Type: application/x-compressed-tar, Size: 8119 bytes --]

  reply	other threads:[~2017-04-04 20:36 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-28 17:50 Dean Jenkins
2017-03-28 17:50 ` [RFC V1 01/16] Bluetooth: hci_ldisc: Add missing return in hci_uart_init_work() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 02/16] Bluetooth: hci_ldisc: Ensure hu->hdev set to NULL before freeing hdev Dean Jenkins
2017-03-28 17:50 ` [RFC V1 03/16] Bluetooth: hci_ldisc: Add missing clear HCI_UART_PROTO_READY Dean Jenkins
2017-03-28 17:50 ` [RFC V1 04/16] Bluetooth: hci_ldisc: Add HCI RESET comment to hci_unregister_dev() call Dean Jenkins
2017-03-30 10:11   ` Marcel Holtmann
2017-03-28 17:50 ` [RFC V1 05/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_send_frame() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 06/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_dequeue() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 07/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_tx_wakeup() Dean Jenkins
2017-03-30 10:11   ` Marcel Holtmann
2017-03-28 17:50 ` [RFC V1 08/16] Bluetooth: hci_ldisc: Separate flag handling in hci_uart_tty_close() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 09/16] Bluetooth: hci_ldisc: Tidy-up HCI_UART_REGISTERED " Dean Jenkins
2017-03-28 17:50 ` [RFC V1 10/16] Bluetooth: hci_ldisc: hci_uart_tty_close() detach tty after last msg Dean Jenkins
2017-03-28 17:50 ` [RFC V1 11/16] Bluetooth: hci_ldisc: hci_uart_tty_close() move hci_uart_close() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 12/16] Bluetooth: hci_ldisc: hci_uart_tty_close() move cancel_work_sync() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 13/16] Bluetooth: hci_ldisc: hci_uart_tty_close() free hu->tx_skb Dean Jenkins
2017-03-28 17:50 ` [RFC V1 14/16] Bluetooth: hci_ldisc: Simplify flushing Dean Jenkins
2017-03-28 17:50 ` [RFC V1 15/16] Bluetooth: hci_ldisc: Use rwlocking to avoid closing proto races Dean Jenkins
2017-03-28 17:50 ` [RFC V1 16/16] Bluetooth: hci_ldisc: Add ioctl_mutex avoiding concurrency Dean Jenkins
2017-03-30 10:11 ` [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes Marcel Holtmann
2017-04-03 15:09   ` Dean Jenkins
2017-04-03 15:51     ` Marcel Holtmann
2017-04-04 20:36       ` Dean Jenkins [this message]
2017-04-05 15:28         ` Dean Jenkins
2017-04-06  7:23           ` Marcel Holtmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3662704c-dfd4-67db-a2f9-45c949c45c6c@mentor.com \
    --to=dean_jenkins@mentor.com \
    --cc=gustavo@padovan.org \
    --cc=johan.hedberg@gmail.com \
    --cc=linux-bluetooth@vger.kernel.org \
    --cc=marcel@holtmann.org \
    --subject='Re: [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.