netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Bluetooth: Power down controller when suspending
@ 2020-11-18 23:43 Abhishek Pandit-Subedi
  2020-11-18 23:43 ` [PATCH 1/3] Bluetooth: Rename and move clean_up_hci_state Abhishek Pandit-Subedi
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Abhishek Pandit-Subedi @ 2020-11-18 23:43 UTC (permalink / raw)
  To: marcel, linux-bluetooth
  Cc: chromeos-bluetooth-upstreaming, mcchou, danielwinkler,
	Abhishek Pandit-Subedi, David S. Miller, Johan Hedberg, netdev,
	linux-kernel, Jakub Kicinski


Hi Marcel and linux-bluetooth,

This patch series adds support for a quirk that will power down the
Bluetooth controller when suspending and power it back up when resuming.

On Marvell SDIO Bluetooth controllers (SD8897 and SD8997), we are seeing
a large number of suspend failures with the following log messages:

[ 4764.773873] Bluetooth: hci_cmd_timeout() hci0 command 0x0c14 tx timeout
[ 4767.777897] Bluetooth: btmrvl_enable_hs() Host sleep enable command failed
[ 4767.777920] Bluetooth: btmrvl_sdio_suspend() HS not actived, suspend failed!
[ 4767.777946] dpm_run_callback(): pm_generic_suspend+0x0/0x48 returns -16
[ 4767.777963] call mmc2:0001:2+ returned -16 after 4882288 usecs

The daily failure rate with this signature is quite significant and
users are likely facing this at least once a day (and some unlucky users
are likely facing it multiple times a day).

Given the severity, we'd like to power off the controller during suspend
so the driver doesn't need to take any action (or block in any way) when
suspending and power on during resume. This will break wake-on-bt for
users but should improve the reliability of suspend.

We don't want to force all users of MVL8897 and MVL8997 to encounter
this behavior if they're not affected (especially users that depend on
Bluetooth for keyboard/mouse input) so the new behavior is enabled via
module param. We are limiting this quirk to only Chromebooks (i.e.
laptop). Chromeboxes will continue to have the old behavior since users
may depend on BT HID to wake and use the system.

These changes were tested in the following ways on a Chromebook running
the 4.19 kernel and a MVL-SD8897 chipset. We added the module param in
/etc/modprobe.d/btmrvl_sdio.conf with the contents
  "options btmrvl_sdio power_down_suspend=Y".

Tests run:

With no devices paired:
- suspend_stress_test --wake_min 10 --suspend_min 10 --count 500

With an LE keyboard paired:
- suspend_stress_test --wake_min 10 --suspend_min 10 --count 500

Using the ChromeOS AVL test suite (stress tests are 25 iterations):
- bluetooth_AdapterSRHealth (basic suite)
- bluetooth_AdapterSRHealth.sr_reconnect_classic_hid_stress
- bluetooth_AdapterSRHealth.sr_reconnect_le_hid_stress

Thanks,
Abhishek


Abhishek Pandit-Subedi (3):
  Bluetooth: Rename and move clean_up_hci_state
  Bluetooth: Add quirk to power down on suspend
  Bluetooth: btmrvl_sdio: Power down when suspending

 drivers/bluetooth/btmrvl_sdio.c  | 10 ++++
 include/net/bluetooth/hci.h      |  7 +++
 include/net/bluetooth/hci_core.h |  6 +++
 net/bluetooth/hci_core.c         | 93 +++++++++++++++++++++++++++++++-
 net/bluetooth/hci_request.c      | 26 ++++++++-
 net/bluetooth/mgmt.c             | 46 +---------------
 6 files changed, 140 insertions(+), 48 deletions(-)

-- 
2.29.2.299.gdc1121823c-goog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] Bluetooth: Rename and move clean_up_hci_state
  2020-11-18 23:43 [PATCH 0/3] Bluetooth: Power down controller when suspending Abhishek Pandit-Subedi
@ 2020-11-18 23:43 ` Abhishek Pandit-Subedi
  2020-11-18 23:43 ` [PATCH 2/3] Bluetooth: Add quirk to power down on suspend Abhishek Pandit-Subedi
  2020-11-23 11:46 ` [PATCH 0/3] Bluetooth: Power down controller when suspending Marcel Holtmann
  2 siblings, 0 replies; 7+ messages in thread
From: Abhishek Pandit-Subedi @ 2020-11-18 23:43 UTC (permalink / raw)
  To: marcel, linux-bluetooth
  Cc: chromeos-bluetooth-upstreaming, mcchou, danielwinkler,
	Abhishek Pandit-Subedi, Daniel Winkler, David S. Miller,
	Johan Hedberg, netdev, linux-kernel, Jakub Kicinski

Rename clean_up_hci_state and move to the core header so that we can
power down the controller from within the kernel rather than just via
mgmt commands.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Daniel Winkler <danielwinkler@google.com>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
---

 include/net/bluetooth/hci_core.h |  2 ++
 net/bluetooth/hci_core.c         | 45 +++++++++++++++++++++++++++++++
 net/bluetooth/mgmt.c             | 46 +-------------------------------
 3 files changed, 48 insertions(+), 45 deletions(-)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index 9873e1c8cd163b..ff32d5a856f17f 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -1072,6 +1072,8 @@ void hci_conn_enter_active_mode(struct hci_conn *conn, __u8 force_active);
 
 void hci_le_conn_failed(struct hci_conn *conn, u8 status);
 
+int hci_clean_up_state(struct hci_dev *hdev);
+
 /*
  * hci_conn_get() and hci_conn_put() are used to control the life-time of an
  * "hci_conn" object. They do not guarantee that the hci_conn object is running,
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 502552d6e9aff3..8e90850d6d769e 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -2290,6 +2290,51 @@ static void hci_power_on(struct work_struct *work)
 	}
 }
 
+static void clean_up_hci_complete(struct hci_dev *hdev, u8 status, u16 opcode)
+{
+	BT_DBG("%s status 0x%02x", hdev->name, status);
+
+	if (hci_conn_count(hdev) == 0) {
+		cancel_delayed_work(&hdev->power_off);
+		queue_work(hdev->req_workqueue, &hdev->power_off.work);
+	}
+}
+
+/* This function requires the caller holds hdev->lock */
+int hci_clean_up_state(struct hci_dev *hdev)
+{
+	struct hci_request req;
+	struct hci_conn *conn;
+	bool discov_stopped;
+	int err;
+	u8 scan = 0x00;
+
+	hci_req_init(&req, hdev);
+
+	if (test_bit(HCI_ISCAN, &hdev->flags) ||
+	    test_bit(HCI_PSCAN, &hdev->flags)) {
+		hci_req_add(&req, HCI_OP_WRITE_SCAN_ENABLE, 1, &scan);
+	}
+
+	hci_req_clear_adv_instance(hdev, NULL, NULL, 0x00, false);
+
+	if (hci_dev_test_flag(hdev, HCI_LE_ADV))
+		__hci_req_disable_advertising(&req);
+
+	discov_stopped = hci_req_stop_discovery(&req);
+
+	list_for_each_entry(conn, &hdev->conn_hash.list, list) {
+		/* 0x15 == Terminated due to Power Off */
+		__hci_abort_conn(&req, conn, 0x15);
+	}
+
+	err = hci_req_run(&req, clean_up_hci_complete);
+	if (!err && discov_stopped)
+		hci_discovery_set_state(hdev, DISCOVERY_STOPPING);
+
+	return err;
+}
+
 static void hci_power_off(struct work_struct *work)
 {
 	struct hci_dev *hdev = container_of(work, struct hci_dev,
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index 12d7b368b428e8..ea136a6b730f9a 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -1122,16 +1122,6 @@ static int send_settings_rsp(struct sock *sk, u16 opcode, struct hci_dev *hdev)
 				 sizeof(settings));
 }
 
-static void clean_up_hci_complete(struct hci_dev *hdev, u8 status, u16 opcode)
-{
-	bt_dev_dbg(hdev, "status 0x%02x", status);
-
-	if (hci_conn_count(hdev) == 0) {
-		cancel_delayed_work(&hdev->power_off);
-		queue_work(hdev->req_workqueue, &hdev->power_off.work);
-	}
-}
-
 void mgmt_advertising_added(struct sock *sk, struct hci_dev *hdev, u8 instance)
 {
 	struct mgmt_ev_advertising_added ev;
@@ -1159,40 +1149,6 @@ static void cancel_adv_timeout(struct hci_dev *hdev)
 	}
 }
 
-static int clean_up_hci_state(struct hci_dev *hdev)
-{
-	struct hci_request req;
-	struct hci_conn *conn;
-	bool discov_stopped;
-	int err;
-
-	hci_req_init(&req, hdev);
-
-	if (test_bit(HCI_ISCAN, &hdev->flags) ||
-	    test_bit(HCI_PSCAN, &hdev->flags)) {
-		u8 scan = 0x00;
-		hci_req_add(&req, HCI_OP_WRITE_SCAN_ENABLE, 1, &scan);
-	}
-
-	hci_req_clear_adv_instance(hdev, NULL, NULL, 0x00, false);
-
-	if (hci_dev_test_flag(hdev, HCI_LE_ADV))
-		__hci_req_disable_advertising(&req);
-
-	discov_stopped = hci_req_stop_discovery(&req);
-
-	list_for_each_entry(conn, &hdev->conn_hash.list, list) {
-		/* 0x15 == Terminated due to Power Off */
-		__hci_abort_conn(&req, conn, 0x15);
-	}
-
-	err = hci_req_run(&req, clean_up_hci_complete);
-	if (!err && discov_stopped)
-		hci_discovery_set_state(hdev, DISCOVERY_STOPPING);
-
-	return err;
-}
-
 static int set_powered(struct sock *sk, struct hci_dev *hdev, void *data,
 		       u16 len)
 {
@@ -1230,7 +1186,7 @@ static int set_powered(struct sock *sk, struct hci_dev *hdev, void *data,
 		err = 0;
 	} else {
 		/* Disconnect connections, stop scans, etc */
-		err = clean_up_hci_state(hdev);
+		err = hci_clean_up_state(hdev);
 		if (!err)
 			queue_delayed_work(hdev->req_workqueue, &hdev->power_off,
 					   HCI_POWER_OFF_TIMEOUT);
-- 
2.29.2.299.gdc1121823c-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] Bluetooth: Add quirk to power down on suspend
  2020-11-18 23:43 [PATCH 0/3] Bluetooth: Power down controller when suspending Abhishek Pandit-Subedi
  2020-11-18 23:43 ` [PATCH 1/3] Bluetooth: Rename and move clean_up_hci_state Abhishek Pandit-Subedi
@ 2020-11-18 23:43 ` Abhishek Pandit-Subedi
  2020-11-23 11:46 ` [PATCH 0/3] Bluetooth: Power down controller when suspending Marcel Holtmann
  2 siblings, 0 replies; 7+ messages in thread
From: Abhishek Pandit-Subedi @ 2020-11-18 23:43 UTC (permalink / raw)
  To: marcel, linux-bluetooth
  Cc: chromeos-bluetooth-upstreaming, mcchou, danielwinkler,
	Abhishek Pandit-Subedi, David S. Miller, Johan Hedberg, netdev,
	linux-kernel, Jakub Kicinski

Some older controllers fail to enter a quiescent state reliably when
supporting remote wake. For those cases, add a quirk that will power
down the controller when suspending and power it back up when resuming.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
---

 include/net/bluetooth/hci.h      |  7 +++++
 include/net/bluetooth/hci_core.h |  4 +++
 net/bluetooth/hci_core.c         | 48 ++++++++++++++++++++++++++++++--
 net/bluetooth/hci_request.c      | 26 ++++++++++++++++-
 4 files changed, 82 insertions(+), 3 deletions(-)

diff --git a/include/net/bluetooth/hci.h b/include/net/bluetooth/hci.h
index c8e67042a3b14c..88d5c9554e4840 100644
--- a/include/net/bluetooth/hci.h
+++ b/include/net/bluetooth/hci.h
@@ -238,6 +238,13 @@ enum {
 	 * during the hdev->setup vendor callback.
 	 */
 	HCI_QUIRK_BROKEN_ERR_DATA_REPORTING,
+
+	/* When this quirk is set, the adapter will be powered down during
+	 * system suspend and powerd up on resume. This should be used on
+	 * controllers that don't behave well during suspend, either causing
+	 * spurious wakeups or not entering a suspend state reliably.
+	 */
+	HCI_QUIRK_POWER_DOWN_SYSTEM_SUSPEND,
 };
 
 /* HCI device flags */
diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index ff32d5a856f17f..e7dc6e3efbf246 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -90,6 +90,7 @@ struct discovery_state {
 };
 
 #define SUSPEND_NOTIFIER_TIMEOUT	msecs_to_jiffies(2000) /* 2 seconds */
+#define SUSPEND_POWER_DOWN_TIMEOUT	msecs_to_jiffies(1000)
 
 enum suspend_tasks {
 	SUSPEND_PAUSE_DISCOVERY,
@@ -112,6 +113,9 @@ enum suspended_state {
 	BT_RUNNING = 0,
 	BT_SUSPEND_DISCONNECT,
 	BT_SUSPEND_CONFIGURE_WAKE,
+	BT_SUSPEND_DO_POWER_DOWN,
+	BT_SUSPEND_DO_POWER_UP,
+	BT_SUSPEND_POWERED_DOWN,	/* Powered down prior to suspend */
 };
 
 struct hci_conn_hash {
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 8e90850d6d769e..d73e097d3ce16b 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3562,6 +3562,7 @@ static int hci_suspend_notifier(struct notifier_block *nb, unsigned long action,
 		container_of(nb, struct hci_dev, suspend_notifier);
 	int ret = 0;
 	u8 state = BT_RUNNING;
+	bool powered;
 
 	/* If powering down, wait for completion. */
 	if (mgmt_powering_down(hdev)) {
@@ -3571,8 +3572,51 @@ static int hci_suspend_notifier(struct notifier_block *nb, unsigned long action,
 			goto done;
 	}
 
-	/* Suspend notifier should only act on events when powered. */
-	if (!hdev_is_powered(hdev))
+	powered = hdev_is_powered(hdev);
+
+	/* Update the suspend state when entering suspend if the system is
+	 * currently powered off or if it is powered on but was previously
+	 * powered off.
+	 */
+	if (action == PM_SUSPEND_PREPARE) {
+		/* Must hold dev lock when modifying suspend state. */
+		hci_dev_lock(hdev);
+		if (powered && hdev->suspend_state == BT_SUSPEND_POWERED_DOWN)
+			hdev->suspend_state = BT_RUNNING;
+		else if (!powered &&
+			 hdev->suspend_state != BT_SUSPEND_POWERED_DOWN)
+			hdev->suspend_state = BT_SUSPEND_POWERED_DOWN;
+
+		hci_dev_unlock(hdev);
+	}
+
+	/* When the power down quirk is set, we power down the adapter when
+	 * suspending and power it up when resuming. If the adapter was already
+	 * powered down before suspend, we don't do anything here.
+	 */
+	if (test_bit(HCI_QUIRK_POWER_DOWN_SYSTEM_SUSPEND, &hdev->quirks) &&
+	    hdev->suspend_state != BT_SUSPEND_POWERED_DOWN) {
+		if (action == PM_SUSPEND_PREPARE && powered) {
+			state = BT_SUSPEND_DO_POWER_DOWN;
+			ret = hci_change_suspend_state(hdev, state);
+
+			/* Emit that we're powering down for suspend */
+			hci_clear_wake_reason(hdev);
+			mgmt_suspending(hdev, state);
+			goto done;
+		} else if (action == PM_POST_SUSPEND && !powered) {
+			/* Emit that we're resuming before powering up. */
+			mgmt_resuming(hdev, hdev->wake_reason, &hdev->wake_addr,
+				      hdev->wake_addr_type);
+
+			state = BT_SUSPEND_DO_POWER_UP;
+			ret = hci_change_suspend_state(hdev, state);
+			goto done;
+		}
+	}
+
+	/* Skip to end if we weren't powered. */
+	if (!powered)
 		goto done;
 
 	if (action == PM_SUSPEND_PREPARE) {
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index 048d4db9d4ea53..804bd0652edd1c 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -1194,6 +1194,7 @@ void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next)
 	struct hci_request req;
 	u8 page_scan;
 	int disconnect_counter;
+	int err;
 
 	if (next == hdev->suspend_state) {
 		bt_dev_dbg(hdev, "Same state before and after: %d", next);
@@ -1273,7 +1274,7 @@ void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next)
 		/* Pause scan changes again. */
 		hdev->scanning_paused = true;
 		hci_req_run(&req, suspend_req_complete);
-	} else {
+	} else if (next == BT_RUNNING) {
 		hdev->suspended = false;
 		hdev->scanning_paused = false;
 
@@ -1306,6 +1307,29 @@ void hci_req_prepare_suspend(struct hci_dev *hdev, enum suspended_state next)
 		}
 
 		hci_req_run(&req, suspend_req_complete);
+	} else if (next == BT_SUSPEND_DO_POWER_DOWN) {
+		hdev->suspended = true;
+		hdev->scanning_paused = true;
+
+		err = hci_clean_up_state(hdev);
+
+		if (!err)
+			queue_delayed_work(hdev->req_workqueue,
+					   &hdev->power_off,
+					   SUSPEND_POWER_DOWN_TIMEOUT);
+
+		if (err == -ENODATA) {
+			cancel_delayed_work(&hdev->power_off);
+			queue_work(hdev->req_workqueue, &hdev->power_off.work);
+			err = 0;
+		}
+
+		set_bit(SUSPEND_POWERING_DOWN, hdev->suspend_tasks);
+	} else if (next == BT_SUSPEND_DO_POWER_UP) {
+		hdev->suspended = false;
+		hdev->scanning_paused = false;
+
+		queue_work(hdev->req_workqueue, &hdev->power_on);
 	}
 
 	hdev->suspend_state = next;
-- 
2.29.2.299.gdc1121823c-goog


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] Bluetooth: Power down controller when suspending
  2020-11-18 23:43 [PATCH 0/3] Bluetooth: Power down controller when suspending Abhishek Pandit-Subedi
  2020-11-18 23:43 ` [PATCH 1/3] Bluetooth: Rename and move clean_up_hci_state Abhishek Pandit-Subedi
  2020-11-18 23:43 ` [PATCH 2/3] Bluetooth: Add quirk to power down on suspend Abhishek Pandit-Subedi
@ 2020-11-23 11:46 ` Marcel Holtmann
  2020-11-24 19:02   ` Abhishek Pandit-Subedi
  2 siblings, 1 reply; 7+ messages in thread
From: Marcel Holtmann @ 2020-11-23 11:46 UTC (permalink / raw)
  To: Abhishek Pandit-Subedi
  Cc: BlueZ development, chromeos-bluetooth-upstreaming, mcchou,
	danielwinkler, David S. Miller, Johan Hedberg, netdev,
	linux-kernel, Jakub Kicinski

Hi Abhishek,

> This patch series adds support for a quirk that will power down the
> Bluetooth controller when suspending and power it back up when resuming.
> 
> On Marvell SDIO Bluetooth controllers (SD8897 and SD8997), we are seeing
> a large number of suspend failures with the following log messages:
> 
> [ 4764.773873] Bluetooth: hci_cmd_timeout() hci0 command 0x0c14 tx timeout
> [ 4767.777897] Bluetooth: btmrvl_enable_hs() Host sleep enable command failed
> [ 4767.777920] Bluetooth: btmrvl_sdio_suspend() HS not actived, suspend failed!
> [ 4767.777946] dpm_run_callback(): pm_generic_suspend+0x0/0x48 returns -16
> [ 4767.777963] call mmc2:0001:2+ returned -16 after 4882288 usecs
> 
> The daily failure rate with this signature is quite significant and
> users are likely facing this at least once a day (and some unlucky users
> are likely facing it multiple times a day).
> 
> Given the severity, we'd like to power off the controller during suspend
> so the driver doesn't need to take any action (or block in any way) when
> suspending and power on during resume. This will break wake-on-bt for
> users but should improve the reliability of suspend.
> 
> We don't want to force all users of MVL8897 and MVL8997 to encounter
> this behavior if they're not affected (especially users that depend on
> Bluetooth for keyboard/mouse input) so the new behavior is enabled via
> module param. We are limiting this quirk to only Chromebooks (i.e.
> laptop). Chromeboxes will continue to have the old behavior since users
> may depend on BT HID to wake and use the system.

I don’t have a super great feeling with this change.

So historically only hciconfig hci0 up/down was doing a power cycle of the controller and when adding the mgmt interface we moved that to the mgmt interface. In addition we added a special case of power up via hdev->setup. We never had an intention that the kernel otherwise can power up/down the controller as it pleases.

Can we ask Marvell first to investigate why this is fundamentally broken with their hardware? Since what you are proposing is a pretty heavy change that might has side affects. For example the state machine for the mgmt interface has no concept of a power down/up from the kernel. It is all triggered by bluetoothd.

I am careful here since the whole power up/down path is already complicated enough.

Regards

Marcel


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] Bluetooth: Power down controller when suspending
  2020-11-23 11:46 ` [PATCH 0/3] Bluetooth: Power down controller when suspending Marcel Holtmann
@ 2020-11-24 19:02   ` Abhishek Pandit-Subedi
  2020-11-24 19:04     ` Abhishek Pandit-Subedi
  2020-11-25 13:47     ` Marcel Holtmann
  0 siblings, 2 replies; 7+ messages in thread
From: Abhishek Pandit-Subedi @ 2020-11-24 19:02 UTC (permalink / raw)
  To: Marcel Holtmann, crlo, akarwar
  Cc: BlueZ development, ChromeOS Bluetooth Upstreaming,
	Miao-chen Chou, Daniel Winkler, David S. Miller, Johan Hedberg,
	netdev, LKML, Jakub Kicinski

Hi Marcel,


On Mon, Nov 23, 2020 at 3:46 AM Marcel Holtmann <marcel@holtmann.org> wrote:
>
> Hi Abhishek,
>
> > This patch series adds support for a quirk that will power down the
> > Bluetooth controller when suspending and power it back up when resuming.
> >
> > On Marvell SDIO Bluetooth controllers (SD8897 and SD8997), we are seeing
> > a large number of suspend failures with the following log messages:
> >
> > [ 4764.773873] Bluetooth: hci_cmd_timeout() hci0 command 0x0c14 tx timeout
> > [ 4767.777897] Bluetooth: btmrvl_enable_hs() Host sleep enable command failed
> > [ 4767.777920] Bluetooth: btmrvl_sdio_suspend() HS not actived, suspend failed!
> > [ 4767.777946] dpm_run_callback(): pm_generic_suspend+0x0/0x48 returns -16
> > [ 4767.777963] call mmc2:0001:2+ returned -16 after 4882288 usecs
> >
> > The daily failure rate with this signature is quite significant and
> > users are likely facing this at least once a day (and some unlucky users
> > are likely facing it multiple times a day).
> >
> > Given the severity, we'd like to power off the controller during suspend
> > so the driver doesn't need to take any action (or block in any way) when
> > suspending and power on during resume. This will break wake-on-bt for
> > users but should improve the reliability of suspend.
> >
> > We don't want to force all users of MVL8897 and MVL8997 to encounter
> > this behavior if they're not affected (especially users that depend on
> > Bluetooth for keyboard/mouse input) so the new behavior is enabled via
> > module param. We are limiting this quirk to only Chromebooks (i.e.
> > laptop). Chromeboxes will continue to have the old behavior since users
> > may depend on BT HID to wake and use the system.
>
> I don’t have a super great feeling with this change.
>
> So historically only hciconfig hci0 up/down was doing a power cycle of the controller and when adding the mgmt interface we moved that to the mgmt interface. In addition we added a special case of power up via hdev->setup. We never had an intention that the kernel otherwise can power up/down the controller as it pleases.

Aside from the powered setting, the stack is resilient to the
controller crashing (which would be akin to a power off and power on).
From the view of bluez, adapter lost and power down should be almost
equivalent right? ChromeOS has several platforms where Bluetooth has
been reset after suspend, usually due USB being powered off in S3, and
the stack is still well-behaving when that occurs.

>
> Can we ask Marvell first to investigate why this is fundamentally broken with their hardware?

+Chin-Ran Lo and +Amitkumar Karwar (added based on changes to
drivers/bluetooth/btmrvl_main.c)

Could you please take a look at the original cover letter and comment
(or add others at Marvell who may be able to)? Is this a known issue
or a fix?

>Since what you are proposing is a pretty heavy change that might has side affects. For example the state machine for the mgmt interface has no concept of a power down/up from the kernel. It is all triggered by bluetoothd.
>
> I am careful here since the whole power up/down path is already complicated enough.
>

That sounds reasonable. I have landed this within ChromeOS so we can
test whether a) this improves stability enough and b) whether the
power off/on in the kernel has significant side effects. This will go
through our automated testing and dogfooding over the next few weeks
and hopefully identify those side-effects. I will re-raise this topic
with updates once we have more data.

Also, in case it wasn't very clear, I put this behind a module param
that defaults to False because this is so heavy handed. We're only
using it on specific Chromebooks that are exhibiting the worst
behavior and not disabling it wholesale for all btmrvl controllers.

Thanks
Abhishek

> Regards
>
> Marcel
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] Bluetooth: Power down controller when suspending
  2020-11-24 19:02   ` Abhishek Pandit-Subedi
@ 2020-11-24 19:04     ` Abhishek Pandit-Subedi
  2020-11-25 13:47     ` Marcel Holtmann
  1 sibling, 0 replies; 7+ messages in thread
From: Abhishek Pandit-Subedi @ 2020-11-24 19:04 UTC (permalink / raw)
  To: Marcel Holtmann, chin-ran.lo, amitkumar.karwar
  Cc: BlueZ development, ChromeOS Bluetooth Upstreaming,
	Miao-chen Chou, Daniel Winkler, David S. Miller, Johan Hedberg,
	netdev, LKML, Jakub Kicinski

Re-send to NXP email addresses for Chin-Ran Lo and Amitkumar Karwar
(Marvell wireless IP acquired by NXP)



On Tue, Nov 24, 2020 at 11:02 AM Abhishek Pandit-Subedi
<abhishekpandit@chromium.org> wrote:
>
> Hi Marcel,
>
>
> On Mon, Nov 23, 2020 at 3:46 AM Marcel Holtmann <marcel@holtmann.org> wrote:
> >
> > Hi Abhishek,
> >
> > > This patch series adds support for a quirk that will power down the
> > > Bluetooth controller when suspending and power it back up when resuming.
> > >
> > > On Marvell SDIO Bluetooth controllers (SD8897 and SD8997), we are seeing
> > > a large number of suspend failures with the following log messages:
> > >
> > > [ 4764.773873] Bluetooth: hci_cmd_timeout() hci0 command 0x0c14 tx timeout
> > > [ 4767.777897] Bluetooth: btmrvl_enable_hs() Host sleep enable command failed
> > > [ 4767.777920] Bluetooth: btmrvl_sdio_suspend() HS not actived, suspend failed!
> > > [ 4767.777946] dpm_run_callback(): pm_generic_suspend+0x0/0x48 returns -16
> > > [ 4767.777963] call mmc2:0001:2+ returned -16 after 4882288 usecs
> > >
> > > The daily failure rate with this signature is quite significant and
> > > users are likely facing this at least once a day (and some unlucky users
> > > are likely facing it multiple times a day).
> > >
> > > Given the severity, we'd like to power off the controller during suspend
> > > so the driver doesn't need to take any action (or block in any way) when
> > > suspending and power on during resume. This will break wake-on-bt for
> > > users but should improve the reliability of suspend.
> > >
> > > We don't want to force all users of MVL8897 and MVL8997 to encounter
> > > this behavior if they're not affected (especially users that depend on
> > > Bluetooth for keyboard/mouse input) so the new behavior is enabled via
> > > module param. We are limiting this quirk to only Chromebooks (i.e.
> > > laptop). Chromeboxes will continue to have the old behavior since users
> > > may depend on BT HID to wake and use the system.
> >
> > I don’t have a super great feeling with this change.
> >
> > So historically only hciconfig hci0 up/down was doing a power cycle of the controller and when adding the mgmt interface we moved that to the mgmt interface. In addition we added a special case of power up via hdev->setup. We never had an intention that the kernel otherwise can power up/down the controller as it pleases.
>
> Aside from the powered setting, the stack is resilient to the
> controller crashing (which would be akin to a power off and power on).
> From the view of bluez, adapter lost and power down should be almost
> equivalent right? ChromeOS has several platforms where Bluetooth has
> been reset after suspend, usually due USB being powered off in S3, and
> the stack is still well-behaving when that occurs.
>
> >
> > Can we ask Marvell first to investigate why this is fundamentally broken with their hardware?
>
> +Chin-Ran Lo and +Amitkumar Karwar (added based on changes to
> drivers/bluetooth/btmrvl_main.c)
>
> Could you please take a look at the original cover letter and comment
> (or add others at Marvell who may be able to)? Is this a known issue
> or a fix?
>
> >Since what you are proposing is a pretty heavy change that might has side affects. For example the state machine for the mgmt interface has no concept of a power down/up from the kernel. It is all triggered by bluetoothd.
> >
> > I am careful here since the whole power up/down path is already complicated enough.
> >
>
> That sounds reasonable. I have landed this within ChromeOS so we can
> test whether a) this improves stability enough and b) whether the
> power off/on in the kernel has significant side effects. This will go
> through our automated testing and dogfooding over the next few weeks
> and hopefully identify those side-effects. I will re-raise this topic
> with updates once we have more data.
>
> Also, in case it wasn't very clear, I put this behind a module param
> that defaults to False because this is so heavy handed. We're only
> using it on specific Chromebooks that are exhibiting the worst
> behavior and not disabling it wholesale for all btmrvl controllers.
>
> Thanks
> Abhishek
>
> > Regards
> >
> > Marcel
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/3] Bluetooth: Power down controller when suspending
  2020-11-24 19:02   ` Abhishek Pandit-Subedi
  2020-11-24 19:04     ` Abhishek Pandit-Subedi
@ 2020-11-25 13:47     ` Marcel Holtmann
  1 sibling, 0 replies; 7+ messages in thread
From: Marcel Holtmann @ 2020-11-25 13:47 UTC (permalink / raw)
  To: Abhishek Pandit-Subedi
  Cc: crlo, akarwar, BlueZ development, ChromeOS Bluetooth Upstreaming,
	Miao-chen Chou, Daniel Winkler, David S. Miller, Johan Hedberg,
	netdev, LKML, Jakub Kicinski

Hi Abhishek,

>>> This patch series adds support for a quirk that will power down the
>>> Bluetooth controller when suspending and power it back up when resuming.
>>> 
>>> On Marvell SDIO Bluetooth controllers (SD8897 and SD8997), we are seeing
>>> a large number of suspend failures with the following log messages:
>>> 
>>> [ 4764.773873] Bluetooth: hci_cmd_timeout() hci0 command 0x0c14 tx timeout
>>> [ 4767.777897] Bluetooth: btmrvl_enable_hs() Host sleep enable command failed
>>> [ 4767.777920] Bluetooth: btmrvl_sdio_suspend() HS not actived, suspend failed!
>>> [ 4767.777946] dpm_run_callback(): pm_generic_suspend+0x0/0x48 returns -16
>>> [ 4767.777963] call mmc2:0001:2+ returned -16 after 4882288 usecs
>>> 
>>> The daily failure rate with this signature is quite significant and
>>> users are likely facing this at least once a day (and some unlucky users
>>> are likely facing it multiple times a day).
>>> 
>>> Given the severity, we'd like to power off the controller during suspend
>>> so the driver doesn't need to take any action (or block in any way) when
>>> suspending and power on during resume. This will break wake-on-bt for
>>> users but should improve the reliability of suspend.
>>> 
>>> We don't want to force all users of MVL8897 and MVL8997 to encounter
>>> this behavior if they're not affected (especially users that depend on
>>> Bluetooth for keyboard/mouse input) so the new behavior is enabled via
>>> module param. We are limiting this quirk to only Chromebooks (i.e.
>>> laptop). Chromeboxes will continue to have the old behavior since users
>>> may depend on BT HID to wake and use the system.
>> 
>> I don’t have a super great feeling with this change.
>> 
>> So historically only hciconfig hci0 up/down was doing a power cycle of the controller and when adding the mgmt interface we moved that to the mgmt interface. In addition we added a special case of power up via hdev->setup. We never had an intention that the kernel otherwise can power up/down the controller as it pleases.
> 
> Aside from the powered setting, the stack is resilient to the
> controller crashing (which would be akin to a power off and power on).
> From the view of bluez, adapter lost and power down should be almost
> equivalent right? ChromeOS has several platforms where Bluetooth has
> been reset after suspend, usually due USB being powered off in S3, and
> the stack is still well-behaving when that occurs.

it gets multitudes more complicated if you look at HCI User Channel and other pieces that utilize the core HCI infrastructure.

HCI interface lost, because of USB disconnect is different. That is a clean path. Similar to RFKILL that just only does a power down.

>> Can we ask Marvell first to investigate why this is fundamentally broken with their hardware?
> 
> +Chin-Ran Lo and +Amitkumar Karwar (added based on changes to
> drivers/bluetooth/btmrvl_main.c)
> 
> Could you please take a look at the original cover letter and comment
> (or add others at Marvell who may be able to)? Is this a known issue
> or a fix?

I wonder if sending a HCI Reset at before entering suspend would be enough. Meaning clear all controller states first and then suspend. This will still disable any kind of remote wakeup support, but might avoid having to fully power down and power up again.

>> Since what you are proposing is a pretty heavy change that might has side affects. For example the state machine for the mgmt interface has no concept of a power down/up from the kernel. It is all triggered by bluetoothd.
>> 
>> I am careful here since the whole power up/down path is already complicated enough.
>> 
> 
> That sounds reasonable. I have landed this within ChromeOS so we can
> test whether a) this improves stability enough and b) whether the
> power off/on in the kernel has significant side effects. This will go
> through our automated testing and dogfooding over the next few weeks
> and hopefully identify those side-effects. I will re-raise this topic
> with updates once we have more data.
> 
> Also, in case it wasn't very clear, I put this behind a module param
> that defaults to False because this is so heavy handed. We're only
> using it on specific Chromebooks that are exhibiting the worst
> behavior and not disabling it wholesale for all btmrvl controllers.

We really need a conformance hci-tester that checks if a controller is behaving correctly as promised.

Regards

Marcel


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-11-25 13:47 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-18 23:43 [PATCH 0/3] Bluetooth: Power down controller when suspending Abhishek Pandit-Subedi
2020-11-18 23:43 ` [PATCH 1/3] Bluetooth: Rename and move clean_up_hci_state Abhishek Pandit-Subedi
2020-11-18 23:43 ` [PATCH 2/3] Bluetooth: Add quirk to power down on suspend Abhishek Pandit-Subedi
2020-11-23 11:46 ` [PATCH 0/3] Bluetooth: Power down controller when suspending Marcel Holtmann
2020-11-24 19:02   ` Abhishek Pandit-Subedi
2020-11-24 19:04     ` Abhishek Pandit-Subedi
2020-11-25 13:47     ` Marcel Holtmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).