All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dean Jenkins <Dean_Jenkins@mentor.com>
To: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <gustavo@padovan.org>,
	Johan Hedberg <johan.hedberg@gmail.com>,
	<linux-bluetooth@vger.kernel.org>
Subject: Re: [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes
Date: Tue, 4 Apr 2017 21:36:33 +0100	[thread overview]
Message-ID: <3662704c-dfd4-67db-a2f9-45c949c45c6c@mentor.com> (raw)
In-Reply-To: <119BB9FC-C735-405B-9A77-E9F102393B7D@holtmann.org>

[-- Attachment #1: Type: text/plain, Size: 7084 bytes --]

Hi Marcel,

On 03/04/17 16:51, Marcel Holtmann wrote:
>
> If this is an issue in 4.10, then lets get this fixed / hardened.
>

I have created a simple test script that leads to a kernel crash in HCI 
UART LDISC. Please note that this is a different scenario to the one 
documented in my previous E-mails but the resulting crash is similar.

I am using a BCSP supported Bluetooth Radio Module connected via a USB 
to serial interface to a 64-bit x86 Linux laptop.

I built kernel v4.10.5 although I messed up the .config settings so 
uname -r gave a strange string of4.10.54.10.5_debug+, sigh

Please note that "btattach" fails to establish a BCSP connection with 
the Bluetooth Radio Module but that is advantageous in exposing a fatal 
race condition in hci_uart_tty_close(). In this case, the BCSP Data Link 
protocol layer attempts to retransmit every 250ms and the BCSP layer 
gets closed down whilst still attempting to retransmit which causes a 
kernel crash.

My proposed patchset fixes this race condition plus some other issues.

Please refer to my attached tarball v4.10.5_bt_crash_info.tgz, the 
contents are:
v4.10.5_pure_testing/
v4.10.5_pure_testing/0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch
v4.10.5_pure_testing/v4.10.5_pure_testing/btattach_loop.sh
v4.10.5_pure_testing/enable_hci_bt_logging.sh
v4.10.5_pure_testing/bt_hci_uart_ldisc_crash_snippet_log.txt

The patch file 
0001-Bluetooth-Debug-shows-how-hci_uart_tty_close-is-call.patch
is just adding debug so you can see that killing "btattach" causes 
do_group_exit() to run in the kernel which cleans-up by using 
tty_ldisc_kill() which calls tty_ldisc_close() which causes 
hci_uart_tty_close() to run.

enable_hci_bt_logging.sh
enables dynamic debug for dmesg although dynamic debug seems unreliable 
these days at least for me and I have to run enable_hci_bt_logging.sh 
multiple times to get all the files logging properly.

bt_hci_uart_ldisc_crash_snippet_log.txt
This shows my analysis of 2 example NULL pointer dereference kernel 
crashes which are easy to reproduce using my test script.

v4.10.5_pure_testing/btattach_loop.sh
This is the simple test script which causes a kernel crash within 10 
minutes of running, it is easily repeatable with my equipment. I think 
the kernel crash occurs before the loop counter reaches 100.

The contents of the script is:

#!/bin/bash

let i=1

while [ true ]
do
         echo "loop $i"
         ./btattach -A /dev/ttyUSB0 -P bcsp &
         sleep 5
         echo "now killing..."
         killall btattach
         i=$(($i + 1))
done

Example kernel crash backtrace:

[ 1561.709737] BUG: unable to handle kernel NULL pointer dereference at 
0000000000000044
[ 1561.709829] IP: _raw_spin_lock_irqsave+0x22/0x40
[ 1561.709862] PGD 0

[ 1561.709889] Oops: 0002 [#1] SMP
[ 1561.709911] Modules linked in: ftdi_sio rfcomm hci_uart btqca xt_set 
ip_set_hash_ip nf_log_ipv6 ip_set nf_log_ipv4 nf_log_common xt_LOG 
ip6table_nat nf_nat_ipv6 xt_recent iptable_nat nf_nat_ipv4 ipt_REJECT 
nf_reject_ipv4 iptable_mangle iptable_raw nf_conntrack_ipv4 
nf_defrag_ipv4 xt_comment ip6t_REJECT nf_reject_ipv6 xt_addrtype bridge 
stp llc xt_mark ip6table_mangle nf_nat_tftp nf_nat_snmp_basic 
nf_conntrack_snmp xt_tcpudp nf_nat_sip xt_CT nf_nat_pptp 
nf_nat_proto_gre nf_nat_irc ip6table_raw nf_nat_h323 xt_multiport 
nf_nat_ftp nf_nat_amanda nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack 
nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane 
nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink 
nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc 
nf_conntrack_h323
[ 1561.710172]  nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack 
iptable_filter ip_tables ip6table_filter ip6_tables x_tables lockd grace 
ctr ccm af_packet bnep msr usbserial uvcvideo btusb btrtl arc4 brcmsmac 
btbcm videobuf2_vmalloc btintel cordic videobuf2_memops bluetooth 
intel_powerclamp brcmutil videobuf2_v4l2 videobuf2_core mac80211 
videodev coretemp kvm_intel kvm joydev media cfg80211 iTCO_wdt 
iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_realtek psmouse 
snd_hda_codec_generic toshiba_acpi input_leds bcma snd_hda_intel 
irqbypass snd_hda_codec toshiba_bluetooth cpufreq_ondemand sparse_keymap 
crc32c_intel cpufreq_conservative industrialio snd_hda_core serio_raw 
wmi rfkill thermal cpufreq_powersave battery ac snd_hwdep snd_pcm 
acpi_cpufreq snd_timer snd toshiba_haps mei_me mei e1000e
[ 1561.718028]  fjes soundcore evdev ptp lpc_ich shpchp pps_core 
intel_ips tpm_tis tpm_tis_core nvram tpm sch_fq_codel sunrpc ipv6 
crc_ccitt autofs4 hid_generic usbhid hid sdhci_pci mmc_block sdhci 
sr_mod ehci_pci ehci_hcd mmc_core usbcore usb_common i915 video button 
i2c_algo_bit drm_kms_helper drm [last unloaded: ftdi_sio]
[ 1561.721317] CPU: 1 PID: 4473 Comm: kworker/1:0 Not tainted 
4.10.54.10.5_debug+ #12
[ 1561.722981] Hardware name: TOSHIBA Satellite R630/Portable PC, BIOS 
Version 1.40   07/20/2010
[ 1561.724660] Workqueue: events hci_uart_write_work [hci_uart]
[ 1561.726377] task: ffff98d9b4f15100 task.stack: ffffa868c1178000
[ 1561.728568] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
[ 1561.730355] RSP: 0000:ffffa868c117bda8 EFLAGS: 00010046
[ 1561.731993] RAX: 0000000000000000 RBX: 0000000000000296 RCX: 
0000000000079ad6
[ 1561.733646] RDX: 0000000000000001 RSI: ffff98da77c5c580 RDI: 
0000000000000044
[ 1561.735314] RBP: ffffa868c117bdb0 R08: 000000000001c5a0 R09: 
ffffffff9965a54a
[ 1561.736991] R10: fffff48441dc9a80 R11: ffff98da73403700 R12: 
0000000000000030
[ 1561.738641] R13: 0000000000000044 R14: 0000000000000030 R15: 
ffff98d9b4e50000
[ 1561.740288] FS:  0000000000000000(0000) GS:ffff98da77c40000(0000) 
knlGS:0000000000000000
[ 1561.741940] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1561.743585] CR2: 0000000000000044 CR3: 0000000128d1a000 CR4: 
00000000000006e0
[ 1561.745238] Call Trace:
[ 1561.746882]  skb_dequeue+0x1d/0x70
[ 1561.748512]  bcsp_dequeue+0x27/0x180 [hci_uart]
[ 1561.750130]  ? kfree_skbmem+0x5a/0x60
[ 1561.751742]  hci_uart_write_work+0xc4/0x100 [hci_uart]
[ 1561.753356]  process_one_work+0x151/0x410
[ 1561.754930]  worker_thread+0x69/0x4b0
[ 1561.756507]  kthread+0x114/0x150
[ 1561.758076]  ? rescuer_thread+0x340/0x340
[ 1561.759636]  ? kthread_park+0x90/0x90
[ 1561.761191]  ret_from_fork+0x2c/0x40
[ 1561.762693] Code: 89 e5 e8 82 96 98 ff 5d c3 66 66 66 66 90 55 48 89 
e5 53 9c 58 66 66 90 66 90 48 89 c3 fa 66 66 90 66 66 90 31 c0 ba 01 00 
00 00 <f0> 0f b1 17 85 c0 75 06 48 89 d8 5b 5d c3 89 c6 e8 29 83 98 ff
[ 1561.766034] RIP: _raw_spin_lock_irqsave+0x22/0x40 RSP: ffffa868c117bda8
[ 1561.768033] CR2: 0000000000000044
[ 1561.774555] ---[ end trace 51cc1d3575c0e559 ]---

Please see my analysis in bt_hci_uart_ldisc_crash_snippet_log.txt

With my patchset applied, this crash appears to no longer occur.

If I manage to produce some more useful results then I will post them.

Thanks,

Best regards,
Dean

-- 
Dean Jenkins
Embedded Software Engineer
Linux Transportation Solutions
Mentor Embedded Software Division
Mentor Graphics (UK) Ltd.


[-- Attachment #2: v4.10.5_bt_crash_info.tgz --]
[-- Type: application/x-compressed-tar, Size: 8119 bytes --]

  reply	other threads:[~2017-04-04 20:36 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-28 17:50 [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes Dean Jenkins
2017-03-28 17:50 ` [RFC V1 01/16] Bluetooth: hci_ldisc: Add missing return in hci_uart_init_work() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 02/16] Bluetooth: hci_ldisc: Ensure hu->hdev set to NULL before freeing hdev Dean Jenkins
2017-03-28 17:50 ` [RFC V1 03/16] Bluetooth: hci_ldisc: Add missing clear HCI_UART_PROTO_READY Dean Jenkins
2017-03-28 17:50 ` [RFC V1 04/16] Bluetooth: hci_ldisc: Add HCI RESET comment to hci_unregister_dev() call Dean Jenkins
2017-03-30 10:11   ` Marcel Holtmann
2017-03-28 17:50 ` [RFC V1 05/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_send_frame() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 06/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_dequeue() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 07/16] Bluetooth: hci_ldisc: Add protocol check to hci_uart_tx_wakeup() Dean Jenkins
2017-03-30 10:11   ` Marcel Holtmann
2017-03-28 17:50 ` [RFC V1 08/16] Bluetooth: hci_ldisc: Separate flag handling in hci_uart_tty_close() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 09/16] Bluetooth: hci_ldisc: Tidy-up HCI_UART_REGISTERED " Dean Jenkins
2017-03-28 17:50 ` [RFC V1 10/16] Bluetooth: hci_ldisc: hci_uart_tty_close() detach tty after last msg Dean Jenkins
2017-03-28 17:50 ` [RFC V1 11/16] Bluetooth: hci_ldisc: hci_uart_tty_close() move hci_uart_close() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 12/16] Bluetooth: hci_ldisc: hci_uart_tty_close() move cancel_work_sync() Dean Jenkins
2017-03-28 17:50 ` [RFC V1 13/16] Bluetooth: hci_ldisc: hci_uart_tty_close() free hu->tx_skb Dean Jenkins
2017-03-28 17:50 ` [RFC V1 14/16] Bluetooth: hci_ldisc: Simplify flushing Dean Jenkins
2017-03-28 17:50 ` [RFC V1 15/16] Bluetooth: hci_ldisc: Use rwlocking to avoid closing proto races Dean Jenkins
2017-03-28 17:50 ` [RFC V1 16/16] Bluetooth: hci_ldisc: Add ioctl_mutex avoiding concurrency Dean Jenkins
2017-03-30 10:11 ` [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes Marcel Holtmann
2017-04-03 15:09   ` Dean Jenkins
2017-04-03 15:51     ` Marcel Holtmann
2017-04-04 20:36       ` Dean Jenkins [this message]
2017-04-05 15:28         ` Dean Jenkins
2017-04-06  7:23           ` Marcel Holtmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3662704c-dfd4-67db-a2f9-45c949c45c6c@mentor.com \
    --to=dean_jenkins@mentor.com \
    --cc=gustavo@padovan.org \
    --cc=johan.hedberg@gmail.com \
    --cc=linux-bluetooth@vger.kernel.org \
    --cc=marcel@holtmann.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.