linux-bluetooth.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect
@ 2022-10-03 14:25 Nicolas Cavallari
  2022-10-03 15:14 ` Paul Menzel
  2022-10-03 15:23 ` [RFC] " bluez.test.bot
  0 siblings, 2 replies; 4+ messages in thread
From: Nicolas Cavallari @ 2022-10-03 14:25 UTC (permalink / raw)
  To: Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz, linux-bluetooth

The USB interface between the host and the bluetooth adapter used for
SCO packets uses an USB isochronous endpoint with a fragmentation scheme
that does not tolerate errors.  Except USB isochronous transfers do
not provide a reliable stream with guaranteed delivery (There is no
retry on error, see USB spec v2.0 5.6 and 8.5.5).

To fragment a packet, the bluetooth HCI simply split it in parts and
transfer them as-is.  The receiver is expected to reconstruct the packet
by assuming the first fragment contains the header and parsing its size
field.  There is no error detection either.

If a fragment is lost, the end result is that the kernel is no longer
synchronized and will pass malformed data to the upper layers, since it
has no way to tell if the first fragment is an actual first fragment or
a continuation fragment.  Resynchronization can only happen by luck and
requires an unbounded amount of time.

The typical symptom for a HSP/HFP bluetooth headset is that the
microphone stops working and dmesg contains piles of rate-limited
"Bluetooth: hci0: SCO packet for unknown connection handle XXXX"
errors for an undeterminate amount of time, until the kernel accidently
resynchronize.

A workaround is to ask the upper layer to prevalidate the first fragment
header.  This is not possible with user channels so this workaround is
disabled in this case.

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
---
 drivers/bluetooth/btusb.c        |  7 +++++--
 include/net/bluetooth/hci_core.h | 20 ++++++++++++++++++++
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
index 15caa6469538..f6256af98233 100644
--- a/drivers/bluetooth/btusb.c
+++ b/drivers/bluetooth/btusb.c
@@ -980,9 +980,12 @@ static int btusb_recv_isoc(struct btusb_data *data, void *buffer, int count)
 
 		if (skb->len == HCI_SCO_HDR_SIZE) {
 			/* Complete SCO header */
-			hci_skb_expect(skb) = hci_sco_hdr(skb)->dlen;
+			struct hci_sco_hdr *hdr = hci_sco_hdr(skb);
 
-			if (skb_tailroom(skb) < hci_skb_expect(skb)) {
+			hci_skb_expect(skb) = hdr->dlen;
+
+			if (skb_tailroom(skb) < hci_skb_expect(skb) ||
+			    !hci_conn_prevalidate_sco_hdr(data->hdev, hdr)) {
 				kfree_skb(skb);
 				skb = NULL;
 
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index e7862903187d..d589b54e89e6 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -1286,6 +1286,26 @@ static inline struct hci_conn *hci_lookup_le_connect(struct hci_dev *hdev)
 	return NULL;
 }
 
+static inline bool hci_conn_prevalidate_sco_hdr(struct hci_dev *hdev,
+						struct hci_sco_hdr *hdr)
+{
+	__u16 handle;
+
+	if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL))
+		// Can't validate, userspace controls everything.
+		return true;
+
+	handle = hci_handle(__le16_to_cpu(hdr->handle));
+
+	switch (hci_conn_lookup_type(hdev, handle)) {
+	case SCO_LINK:
+	case ESCO_LINK:
+		return true;
+	default:
+		return false;
+	}
+}
+
 int hci_disconnect(struct hci_conn *conn, __u8 reason);
 bool hci_setup_sync(struct hci_conn *conn, __u16 handle);
 void hci_sco_setup(struct hci_conn *conn, __u8 status);
-- 
2.37.2


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect
  2022-10-03 14:25 [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect Nicolas Cavallari
@ 2022-10-03 15:14 ` Paul Menzel
  2022-10-04  8:02   ` Nicolas Cavallari
  2022-10-03 15:23 ` [RFC] " bluez.test.bot
  1 sibling, 1 reply; 4+ messages in thread
From: Paul Menzel @ 2022-10-03 15:14 UTC (permalink / raw)
  To: Nicolas Cavallari
  Cc: Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz, linux-bluetooth

Dear Nicolas,


Thank you for the patch. Just a small suggestion to make the 
summary/title shorter:

Work around SCO over USB HCI design defect

Am 03.10.22 um 16:25 schrieb Nicolas Cavallari:
> The USB interface between the host and the bluetooth adapter used for
> SCO packets uses an USB isochronous endpoint with a fragmentation scheme
> that does not tolerate errors.  Except USB isochronous transfers do
> not provide a reliable stream with guaranteed delivery (There is no
> retry on error, see USB spec v2.0 5.6 and 8.5.5).

I’d put the dot/period after delivery, and inside the brackets after 8.5.5.

> To fragment a packet, the bluetooth HCI simply split it in parts and

split*s*

> transfer them as-is.  The receiver is expected to reconstruct the packet
> by assuming the first fragment contains the header and parsing its size
> field.  There is no error detection either.
> 
> If a fragment is lost, the end result is that the kernel is no longer
> synchronized and will pass malformed data to the upper layers, since it
> has no way to tell if the first fragment is an actual first fragment or
> a continuation fragment.  Resynchronization can only happen by luck and
> requires an unbounded amount of time.
> 
> The typical symptom for a HSP/HFP bluetooth headset is that the
> microphone stops working and dmesg contains piles of rate-limited
> "Bluetooth: hci0: SCO packet for unknown connection handle XXXX"
> errors for an undeterminate amount of time, until the kernel accidently

indeterminate, accidentally

> resynchronize.
> 
> A workaround is to ask the upper layer to prevalidate the first fragment
> header.  This is not possible with user channels so this workaround is
> disabled in this case.

Please document your test setup.

Please excuse my ignorance, but I have a few more questions:

1.  Does that also happen with Android?
2.  Is it possible to reproduce in QEMU for example?


Kind regards,

Paul


> Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
> ---
>   drivers/bluetooth/btusb.c        |  7 +++++--
>   include/net/bluetooth/hci_core.h | 20 ++++++++++++++++++++
>   2 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c
> index 15caa6469538..f6256af98233 100644
> --- a/drivers/bluetooth/btusb.c
> +++ b/drivers/bluetooth/btusb.c
> @@ -980,9 +980,12 @@ static int btusb_recv_isoc(struct btusb_data *data, void *buffer, int count)
>   
>   		if (skb->len == HCI_SCO_HDR_SIZE) {
>   			/* Complete SCO header */
> -			hci_skb_expect(skb) = hci_sco_hdr(skb)->dlen;
> +			struct hci_sco_hdr *hdr = hci_sco_hdr(skb);
>   
> -			if (skb_tailroom(skb) < hci_skb_expect(skb)) {
> +			hci_skb_expect(skb) = hdr->dlen;
> +
> +			if (skb_tailroom(skb) < hci_skb_expect(skb) ||
> +			    !hci_conn_prevalidate_sco_hdr(data->hdev, hdr)) {
>   				kfree_skb(skb);
>   				skb = NULL;
>   
> diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> index e7862903187d..d589b54e89e6 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -1286,6 +1286,26 @@ static inline struct hci_conn *hci_lookup_le_connect(struct hci_dev *hdev)
>   	return NULL;
>   }
>   
> +static inline bool hci_conn_prevalidate_sco_hdr(struct hci_dev *hdev,
> +						struct hci_sco_hdr *hdr)
> +{
> +	__u16 handle;
> +
> +	if (hci_dev_test_flag(hdev, HCI_USER_CHANNEL))
> +		// Can't validate, userspace controls everything.
> +		return true;
> +
> +	handle = hci_handle(__le16_to_cpu(hdr->handle));
> +
> +	switch (hci_conn_lookup_type(hdev, handle)) {
> +	case SCO_LINK:
> +	case ESCO_LINK:
> +		return true;
> +	default:
> +		return false;
> +	}
> +}
> +
>   int hci_disconnect(struct hci_conn *conn, __u8 reason);
>   bool hci_setup_sync(struct hci_conn *conn, __u16 handle);
>   void hci_sco_setup(struct hci_conn *conn, __u8 status);

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [RFC] Bluetooth: Add a workaround for SCO over USB HCI design defect
  2022-10-03 14:25 [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect Nicolas Cavallari
  2022-10-03 15:14 ` Paul Menzel
@ 2022-10-03 15:23 ` bluez.test.bot
  1 sibling, 0 replies; 4+ messages in thread
From: bluez.test.bot @ 2022-10-03 15:23 UTC (permalink / raw)
  To: linux-bluetooth, nicolas.cavallari

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

This is automated email and please do not reply to this email!

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
This is a CI test results with your patch series:
PW Link:https://patchwork.kernel.org/project/bluetooth/list/?series=682780

---Test result---

Test Summary:
CheckPatch                    PASS      2.27 seconds
GitLint                       PASS      1.12 seconds
SubjectPrefix                 PASS      0.88 seconds
BuildKernel                   PASS      38.94 seconds
BuildKernel32                 PASS      35.46 seconds
Incremental Build with patchesPASS      48.99 seconds
TestRunner: Setup             PASS      580.44 seconds
TestRunner: l2cap-tester      PASS      19.04 seconds
TestRunner: iso-tester        PASS      18.33 seconds
TestRunner: bnep-tester       PASS      7.33 seconds
TestRunner: mgmt-tester       PASS      112.41 seconds
TestRunner: rfcomm-tester     PASS      11.39 seconds
TestRunner: sco-tester        PASS      10.90 seconds
TestRunner: ioctl-tester      PASS      12.24 seconds
TestRunner: smp-tester        PASS      10.71 seconds
TestRunner: userchan-tester   PASS      7.55 seconds



---
Regards,
Linux Bluetooth


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect
  2022-10-03 15:14 ` Paul Menzel
@ 2022-10-04  8:02   ` Nicolas Cavallari
  0 siblings, 0 replies; 4+ messages in thread
From: Nicolas Cavallari @ 2022-10-04  8:02 UTC (permalink / raw)
  To: Paul Menzel
  Cc: Marcel Holtmann, Johan Hedberg, Luiz Augusto von Dentz, linux-bluetooth

On 03/10/2022 17:14, Paul Menzel wrote:
> Dear Nicolas,
> 
> 
> Thank you for the patch. Just a small suggestion to make the 
> summary/title shorter:
> 
> Work around SCO over USB HCI design defect
> 
> Am 03.10.22 um 16:25 schrieb Nicolas Cavallari:
>> The USB interface between the host and the bluetooth adapter used for
>> SCO packets uses an USB isochronous endpoint with a fragmentation scheme
>> that does not tolerate errors.  Except USB isochronous transfers do
>> not provide a reliable stream with guaranteed delivery (There is no
>> retry on error, see USB spec v2.0 5.6 and 8.5.5).
> 
> I’d put the dot/period after delivery, and inside the brackets after 8.5.5.
> 
>> To fragment a packet, the bluetooth HCI simply split it in parts and
> 
> split*s*
> 
>> transfer them as-is.  The receiver is expected to reconstruct the packet
>> by assuming the first fragment contains the header and parsing its size
>> field.  There is no error detection either.
>>
>> If a fragment is lost, the end result is that the kernel is no longer
>> synchronized and will pass malformed data to the upper layers, since it
>> has no way to tell if the first fragment is an actual first fragment or
>> a continuation fragment.  Resynchronization can only happen by luck and
>> requires an unbounded amount of time.
>>
>> The typical symptom for a HSP/HFP bluetooth headset is that the
>> microphone stops working and dmesg contains piles of rate-limited
>> "Bluetooth: hci0: SCO packet for unknown connection handle XXXX"
>> errors for an undeterminate amount of time, until the kernel accidently
> 
> indeterminate, accidentally
> 

Thanks for the typos, will fix them in v2.

>> resynchronize.
>>
>> A workaround is to ask the upper layer to prevalidate the first fragment
>> header.  This is not possible with user channels so this workaround is
>> disabled in this case.
> 
> Please document your test setup.

My current worst case is a ath3k (AR3012) on an imx6 board (GW52xx)
running 5.19.8. The issue does not depend on the hfp/hsp headset.

It should be noted that the AR3012 is an USB1 device and it is plugged 
into a USB2 root hub. This is already a special case for USB.

What happens when looking at the usbmon traces is that around every 
~2s-3s, what should be a 9 byte fragment is replaced by a 0 byte 
fragment, which is still reported as successfully transfered.

> Please excuse my ignorance, but I have a few more questions:
> 
> 1.  Does that also happen with Android?

I haven't tested to run android on this platform, but is the android 
kernel that different from a vanilla linux ?

> 2.  Is it possible to reproduce in QEMU for example?

This platform is probably not powerful enough to run qemu.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-10-04  8:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03 14:25 [RFC PATCH] Bluetooth: Add a workaround for SCO over USB HCI design defect Nicolas Cavallari
2022-10-03 15:14 ` Paul Menzel
2022-10-04  8:02   ` Nicolas Cavallari
2022-10-03 15:23 ` [RFC] " bluez.test.bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).