From: Marcel Holtmann <marcel@holtmann.org>
To: Dean Jenkins <Dean_Jenkins@mentor.com>
Cc: "Gustavo F. Padovan" <gustavo@padovan.org>,
	Johan Hedberg <johan.hedberg@gmail.com>,
	linux-bluetooth@vger.kernel.org
Subject: Re: [RFC V1 00/16] hci_ldisc hci_uart_tty_close() fixes
Date: Mon, 3 Apr 2017 17:51:59 +0200
Message-ID: <119BB9FC-C735-405B-9A77-E9F102393B7D@holtmann.org>
In-Reply-To: <ee49f939-1094-0ae0-0fa7-6c5cad112527@mentor.com>

Hi Dean,

>>> This is RFC patchset V1 which reorganises hci_uart_tty_close() to overcome a
>>> design flaw. I would like some comments on the changes.
>>> 
>>> Design Flaw
>>> ===========
>>> 
>>> An example callstack is as follows
>>> 
>>> Have Bluetooth running using a BCSP based UART Bluetooth Radio Module.
>>> 
>>> Now kill the userland hciattach program by doing
>>> killall hciattach
>> Is there any chance we can convert BCSP support to run fully inside the kernel with the new parts we have put in, and with that then also use btattach? Splitting some parts of BCSP out into userspace never seems to have been a good idea.
> 
> I am not aware of "the new parts we [you] have put in" to the kernel because I am working with the older 3.14 kernel with userland components that are not BlueZ based, but the kernel issue is observable. Is there a web page where I can find out about your design changes for the new parts?
> 
> My efforts are to improve the latest upstream kernel to eliminate this kernel design flaw in HCI UART LDISC (Note TTY LDISC is also broken but not fixed by my patchset).
> 
> I see that "btattach" is at https://git.kernel.org/pub/scm/bluetooth/bluez.git/tree/tools/btattach.c, however, I am unable to identify whether Linux distributions such as Ubuntu have a bluez package that contains "btattach". Is "btattach" a replacement for "hciattach"?

Yes, we want to move towards btattach, which just assigns the line discipline and selects the UART protocol. Everything else, including firmware download, speed changes, recovery etc., should be done inside the kernel.

And later with serdev, we would not even need btattach anymore. UART based Bluetooth devices would be enumerated via DT and the TTY not even exposed to userspace. We are slowly getting to that point.

The latest kernel has UART drivers like hci_intel.c and hci_bcm.c which do a lot of things in the kernel. And btattach is just the process that keeps the line discipline open.

>> I am a bit reluctant to change major hci_ldisc pieces because of just one broken protocol. Running BCSP fully in the kernel seems a better solution to deal with some of these issues.
> 
> The BCSP software in the kernel is not broken, although it is not fully implemented, as you already highlighted. The issue is that HCI UART LDISC (and TTY LDISC) has a broken procedure for closing down the HCI UART device via hci_uart_tty_close().
> 
> This means that I don't see how your suggestion helps to resolve the kernel design flaw which is related to closing down any of the Bluetooth Data Link protocol layers such as H4, H5, and BCSP (I use BCSP). This flaw seems to me to be a long standing Bluetooth kernel Data Link protocol layer closedown issue and is unrelated to how the Data Link protocol layer is established (connected). Therefore, having BCSP partly in userland is irrelevant to this kernel design flaw. Even with BCSP fully in the kernel, the protocol closedown issue will remain present I think.

If you think there are issues, then let's fix them for all protocols. I assumed this was BCSP specific.

> I might try to build "btattach" and have a go to use it. If you look inside the source code of "btattach" and "hciattach" you can see the problem area in closing down an established Bluetooth Data Link protocol layer by the use of:
> 
>    if (ioctl(fd, TIOCSETD, &ldisc) < 0) {
>        perror("Failed set serial line discipline");
>        close(fd);
>        return -1;
>    }
> 
> This userland call is the problem area as this asynchronous ioctl TIOCSETD can cause hci_uart_tty_close() to run and I think it can cause trouble for ALL the Bluetooth Data Link protocol layers such as H4, H5 and BCSP.
> 
> The design flaw is exposed after the Data Link protocol layer has been established (connected) and ioctl TIOCSETD is used from userland. In my example, I killed "hciattach" which is an abnormal scenario but it still needs good handling. I think I have strace evidence of TIOCSETD being used due to SIGTERM.
> 
> The design flaw is because TIOCSETD can trigger the sending of a HCI RESET command during closedown of HCI UART LDISC, TTY LDISC and the Data Link protocol layer. I only have experience of BCSP but I suspect H4 and H5 have retransmission procedures similar to BCSP so would also be susceptible to this issue of trying to send a HCI RESET command whilst closing down the needed data path to the UART driver which causes sending of the HCI RESET command to be unsuccessful.
> 
> I think the callstack is:
> 
> Userland ioctl TIOCSETD executes causing =>
> Kernel ioctl system call which runs
> tty_ioctl()
> tiocsetd()
> tty_set_ldisc()
> tty_ldisc_close()
> hci_uart_tty_close()
> hci_unregister_dev()
> hci_dev_do_close()
> __hci_req_sync() which tries to send a HCI RESET command which depends on
> HCI_QUIRK_RESET_ON_CLOSE being enabled and that is the default condition
> 
> I believe it will affect the closure of any of the Bluetooth Data Link protocol layers.
> 
> Note that not enabling HCI_QUIRK_RESET_ON_CLOSE does not fully help because if Data Link protocol layer retransmissions are occurring when hci_uart_tty_close() runs then the various race conditions are still present in hci_uart_tty_close().
> 
> I suspect evidence of the design flaw can be observed by measuring the execution time of the userland ioctl TIOCSETD calls. I predict that sometimes it will take 2 seconds for TIOCSETD to complete due to being blocked waiting for the unsuccessful attempt at sending the HCI RESET command because the HCI command time-out expires. I believe this will be independent of the underlying Bluetooth Data Link protocol layer.
> 
> Do you have any suggestions for moving forward in accepting my proposed changes? I will try to provide more observable evidence of the issue on kernel v4.10 on a Linux PC.

If this is an issue in 4.10, then let's get this fixed / hardened.

Regards

Marcel
