* [PATCH v1] Bluetooth: Fix race condition in handling NOP command
@ 2021-08-04 17:39 Kiran K
  2021-08-04 18:13 ` [v1] " bluez.test.bot
  2021-08-05 13:11 ` [PATCH v1] " Marcel Holtmann
  0 siblings, 2 replies; 7+ messages in thread
From: Kiran K @ 2021-08-04 17:39 UTC (permalink / raw)
  To: linux-bluetooth; +Cc: ravishankar.srivatsa, chethan.tumkur.narayan, Kiran K

For the NOP command, the work scheduled on cmd_timer needs to be cancelled
on receiving a command status or command complete event.

The use case below might lead to a race condition when multiple NOP
commands are queued sequentially:

hci_cmd_work() {
   if (atomic_read(&hdev->cmd_cnt)) {
            .
            .
            .
      atomic_dec(&hdev->cmd_cnt);
      hci_send_frame(hdev,...);
      schedule_delayed_work(&hdev->cmd_timer,...);
   }
}

On receiving the event for the first NOP, the work scheduled on
hdev->cmd_timer is not cancelled, and the second NOP is dequeued and sent
to the controller.

While waiting for the event for the second NOP command, the work scheduled
on cmd_timer for the first NOP can run, resulting in a third NOP command
being sent without waiting for the event for the second NOP. This might
cause issues on the controller side (such as memory overrun or the
controller becoming unresponsive), resulting in HCI TX timeouts, hardware
errors, etc.

Signed-off-by: Kiran K <kiran.k@intel.com>
Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
---
 net/bluetooth/hci_event.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index ea7fc09478be..14dfbdc8b81b 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
 static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
 					    u16 opcode, u8 ncmd)
 {
-	if (opcode != HCI_OP_NOP)
-		cancel_delayed_work(&hdev->cmd_timer);
+	cancel_delayed_work(&hdev->cmd_timer);
 
 	if (!test_bit(HCI_RESET, &hdev->flags)) {
 		if (ncmd) {
-- 
2.17.1
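
Note: the race described in the commit message can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not
kernel code: struct toy_hdev, the toy_* functions, and the tick values are
all hypothetical, and the model assumes that every event carries ncmd != 0
and that scheduling delayed work which is already pending leaves the old
expiry in place.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_CMD_TIMEOUT 10 /* ticks until cmd_timer fires (made-up value) */
#define TOY_EVENT_DELAY  8 /* ticks until the controller answers (made-up value) */
#define TOY_MAX_CMDS     8

/* Hypothetical stand-in for the few hci_dev fields the commit message uses. */
struct toy_hdev {
	int cmd_cnt;        /* command credits, like hdev->cmd_cnt */
	int queued;         /* NOPs still waiting in the command queue */
	int sent;           /* NOPs handed to the controller so far */
	int in_flight;      /* NOPs sent but not yet answered */
	int max_in_flight;  /* worst case seen during the run */
	bool timer_pending; /* is work pending on cmd_timer? */
	int timer_expiry;   /* tick at which that work runs */
	int event_due[TOY_MAX_CMDS]; /* tick at which each NOP is answered */
};

/* Mirrors the hci_cmd_work() outline from the commit message. */
static void toy_cmd_work(struct toy_hdev *h, int now)
{
	if (h->cmd_cnt > 0 && h->queued > 0) {
		h->cmd_cnt--;
		h->queued--;
		h->event_due[h->sent++] = now + TOY_EVENT_DELAY;
		if (++h->in_flight > h->max_in_flight)
			h->max_in_flight = h->in_flight;
		/* Scheduling already-pending delayed work keeps the old expiry. */
		if (!h->timer_pending) {
			h->timer_pending = true;
			h->timer_expiry = now + TOY_CMD_TIMEOUT;
		}
	}
}

/* Event with ncmd != 0; cancel_timer selects patched (true) or old behaviour. */
static void toy_event(struct toy_hdev *h, int now, bool cancel_timer)
{
	if (cancel_timer)
		h->timer_pending = false;
	h->in_flight--;
	h->cmd_cnt = 1;
	toy_cmd_work(h, now);
}

/* cmd_timer expiry: the host stops waiting and allows another command. */
static void toy_timeout(struct toy_hdev *h, int now)
{
	h->timer_pending = false;
	h->cmd_cnt = 1;
	toy_cmd_work(h, now);
}

/* Queue 'nops' NOP commands and return the peak number outstanding at once. */
int toy_max_in_flight(int nops, bool cancel_timer)
{
	struct toy_hdev h = { .cmd_cnt = 1, .queued = nops };
	int answered = 0;

	assert(nops >= 0 && nops <= TOY_MAX_CMDS);
	toy_cmd_work(&h, 0);
	for (int now = 1; now < 100; now++) {
		while (answered < h.sent && h.event_due[answered] == now) {
			toy_event(&h, now, cancel_timer);
			answered++;
		}
		if (h.timer_pending && h.timer_expiry == now)
			toy_timeout(&h, now);
	}
	return h.max_in_flight;
}
```

With these made-up timings, toy_max_in_flight(3, false) returns 2, because
the stale timer from the first NOP releases the third NOP while the second
is still outstanding, and toy_max_in_flight(3, true) returns 1.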



* RE: [v1] Bluetooth: Fix race condition in handling NOP command
  2021-08-04 17:39 [PATCH v1] Bluetooth: Fix race condition in handling NOP command Kiran K
@ 2021-08-04 18:13 ` bluez.test.bot
  2021-08-05 13:11 ` [PATCH v1] " Marcel Holtmann
  1 sibling, 0 replies; 7+ messages in thread
From: bluez.test.bot @ 2021-08-04 18:13 UTC (permalink / raw)
  To: linux-bluetooth, kiran.k

[-- Attachment #1: Type: text/plain, Size: 2727 bytes --]

This is an automated email; please do not reply to it.

Dear submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=526461

---Test result---

Test Summary:
CheckPatch                    PASS      0.48 seconds
GitLint                       PASS      0.11 seconds
BuildKernel                   PASS      573.83 seconds
TestRunner: Setup             PASS      379.63 seconds
TestRunner: l2cap-tester      PASS      2.70 seconds
TestRunner: bnep-tester       PASS      2.03 seconds
TestRunner: mgmt-tester       PASS      30.92 seconds
TestRunner: rfcomm-tester     PASS      2.40 seconds
TestRunner: sco-tester        PASS      2.23 seconds
TestRunner: smp-tester        FAIL      2.27 seconds
TestRunner: userchan-tester   PASS      2.18 seconds

Details
##############################
Test: CheckPatch - PASS - 0.48 seconds
Run checkpatch.pl script with rule in .checkpatch.conf


##############################
Test: GitLint - PASS - 0.11 seconds
Run gitlint with rule in .gitlint


##############################
Test: BuildKernel - PASS - 573.83 seconds
Build the kernel with a minimal configuration that supports Bluetooth


##############################
Test: TestRunner: Setup - PASS - 379.63 seconds
Setup environment for running Test Runner


##############################
Test: TestRunner: l2cap-tester - PASS - 2.70 seconds
Run test-runner with l2cap-tester
Total: 40, Passed: 40 (100.0%), Failed: 0, Not Run: 0

##############################
Test: TestRunner: bnep-tester - PASS - 2.03 seconds
Run test-runner with bnep-tester
Total: 1, Passed: 1 (100.0%), Failed: 0, Not Run: 0

##############################
Test: TestRunner: mgmt-tester - PASS - 30.92 seconds
Run test-runner with mgmt-tester
Total: 448, Passed: 445 (99.3%), Failed: 0, Not Run: 3

##############################
Test: TestRunner: rfcomm-tester - PASS - 2.40 seconds
Run test-runner with rfcomm-tester
Total: 9, Passed: 9 (100.0%), Failed: 0, Not Run: 0

##############################
Test: TestRunner: sco-tester - PASS - 2.23 seconds
Run test-runner with sco-tester
Total: 8, Passed: 8 (100.0%), Failed: 0, Not Run: 0

##############################
Test: TestRunner: smp-tester - FAIL - 2.27 seconds
Run test-runner with smp-tester
Total: 8, Passed: 7 (87.5%), Failed: 1, Not Run: 0

Failed Test Cases
SMP Client - SC Request 2                            Failed       0.027 seconds

##############################
Test: TestRunner: userchan-tester - PASS - 2.18 seconds
Run test-runner with userchan-tester
Total: 3, Passed: 3 (100.0%), Failed: 0, Not Run: 0



---
Regards,
Linux Bluetooth


[-- Attachment #2: l2cap-tester.log --]
[-- Type: application/octet-stream, Size: 44384 bytes --]

[-- Attachment #3: bnep-tester.log --]
[-- Type: application/octet-stream, Size: 3592 bytes --]

[-- Attachment #4: mgmt-tester.log --]
[-- Type: application/octet-stream, Size: 616862 bytes --]

[-- Attachment #5: rfcomm-tester.log --]
[-- Type: application/octet-stream, Size: 11712 bytes --]

[-- Attachment #6: sco-tester.log --]
[-- Type: application/octet-stream, Size: 9947 bytes --]

[-- Attachment #7: smp-tester.log --]
[-- Type: application/octet-stream, Size: 11740 bytes --]

[-- Attachment #8: userchan-tester.log --]
[-- Type: application/octet-stream, Size: 5488 bytes --]


* Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
  2021-08-04 17:39 [PATCH v1] Bluetooth: Fix race condition in handling NOP command Kiran K
  2021-08-04 18:13 ` [v1] " bluez.test.bot
@ 2021-08-05 13:11 ` Marcel Holtmann
  2021-08-06 14:44   ` K, Kiran
  1 sibling, 1 reply; 7+ messages in thread
From: Marcel Holtmann @ 2021-08-05 13:11 UTC (permalink / raw)
  To: Kiran K; +Cc: BlueZ, ravishankar.srivatsa, chethan.tumkur.narayan

Hi Kiran,

> For the NOP command, the work scheduled on cmd_timer needs to be cancelled
> on receiving a command status or command complete event.
> 
> The use case below might lead to a race condition when multiple NOP
> commands are queued sequentially:
> 
> hci_cmd_work() {
>   if (atomic_read(&hdev->cmd_cnt)) {
>            .
>            .
>            .
>      atomic_dec(&hdev->cmd_cnt);
>      hci_send_frame(hdev,...);
>      schedule_delayed_work(&hdev->cmd_timer,...);
>   }
> }
> 
> On receiving the event for the first NOP, the work scheduled on
> hdev->cmd_timer is not cancelled, and the second NOP is dequeued and sent
> to the controller.
> 
> While waiting for the event for the second NOP command, the work scheduled
> on cmd_timer for the first NOP can run, resulting in a third NOP command
> being sent without waiting for the event for the second NOP. This might
> cause issues on the controller side (such as memory overrun or the
> controller becoming unresponsive), resulting in HCI TX timeouts, hardware
> errors, etc.
> 
> Signed-off-by: Kiran K <kiran.k@intel.com>
> Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
> Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
> ---
> net/bluetooth/hci_event.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index ea7fc09478be..14dfbdc8b81b 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
> static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
> 					    u16 opcode, u8 ncmd)
> {
> -	if (opcode != HCI_OP_NOP)
> -		cancel_delayed_work(&hdev->cmd_timer);
> +	cancel_delayed_work(&hdev->cmd_timer);
> 
> 	if (!test_bit(HCI_RESET, &hdev->flags)) {
> 		if (ncmd) {

so this is conflicting with the patch introducing the ncmd timeout handling.

commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
Author: Manish Mandlik <mmandlik@google.com>
Date:   Thu Apr 29 10:24:22 2021 -0700

    Bluetooth: Add ncmd=0 recovery handling
    
    During command status or command complete event, the controller may set
    ncmd=0 indicating that it is not accepting any more commands. In such a
    case, host holds off sending any more commands to the controller. If the
    controller doesn't recover from such condition, host will wait forever,
    until the user decides that the Bluetooth is broken and may power cycle
    the Bluetooth.
    
    This patch triggers the hardware error to reset the controller and
    driver when it gets into such state as there is no other way out.

Nowhere in your commit description do you address why this is the right thing to do.

Regards

Marcel
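
Note: the recovery behaviour described in the quoted commit message can be
modelled roughly as below. This is a hedged userspace sketch, not the code
from that commit: struct ncmd_model and both functions are hypothetical
names, and a boolean stands in for the delayed work that the kernel would
use for the recovery timer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names come from the kernel. */
struct ncmd_model {
	bool recovery_timer_pending; /* armed while the controller reports ncmd=0 */
	bool hw_error_injected;      /* set once the host gives up waiting */
	int cmd_cnt;                 /* command credits available to the host */
};

/* Called for each command status / command complete event. */
void ncmd_model_on_event(struct ncmd_model *m, unsigned char ncmd)
{
	assert(m);
	if (ncmd) {
		/* Controller accepts commands again: stop the recovery timer. */
		m->recovery_timer_pending = false;
		m->cmd_cnt = 1;
	} else {
		/* Controller accepts no commands: hold off and start waiting. */
		m->recovery_timer_pending = true;
	}
}

/* Called when the recovery timer would expire. */
void ncmd_model_on_timeout(struct ncmd_model *m)
{
	assert(m);
	if (!m->recovery_timer_pending)
		return; /* the timer was cancelled by a later event */
	m->recovery_timer_pending = false;
	m->hw_error_injected = true; /* reset controller and driver */
}
```

In this model the hardware error is injected only if no event with a
non-zero ncmd arrives before the timeout; the command timer that the patch
under review touches does not appear in it at all.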



* RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
  2021-08-05 13:11 ` [PATCH v1] " Marcel Holtmann
@ 2021-08-06 14:44   ` K, Kiran
  2021-08-12 10:55     ` K, Kiran
  0 siblings, 1 reply; 7+ messages in thread
From: K, Kiran @ 2021-08-06 14:44 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: BlueZ, Srivatsa, Ravishankar, Tumkur Narayan, Chethan

Hi Marcel,

> -----Original Message-----
> From: Marcel Holtmann <marcel@holtmann.org>
> Sent: Thursday, August 5, 2021 6:41 PM
> To: K, Kiran <kiran.k@intel.com>
> Cc: BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> <chethan.tumkur.narayan@intel.com>
> Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> command
> 
> Hi Kiran,
> 
> > For the NOP command, the work scheduled on cmd_timer needs to be cancelled
> > on receiving a command status or command complete event.
> >
> > The use case below might lead to a race condition when multiple NOP
> > commands are queued sequentially:
> >
> > hci_cmd_work() {
> >   if (atomic_read(&hdev->cmd_cnt)) {
> >            .
> >            .
> >            .
> >      atomic_dec(&hdev->cmd_cnt);
> >      hci_send_frame(hdev,...);
> >      schedule_delayed_work(&hdev->cmd_timer,...);
> >   }
> > }
> >
> > On receiving the event for the first NOP, the work scheduled on
> > hdev->cmd_timer is not cancelled, and the second NOP is dequeued and sent
> > to the controller.
> >
> > While waiting for the event for the second NOP command, the work scheduled
> > on cmd_timer for the first NOP can run, resulting in a third NOP command
> > being sent without waiting for the event for the second NOP. This might
> > cause issues on the controller side (such as memory overrun or the
> > controller becoming unresponsive), resulting in HCI TX timeouts, hardware
> > errors, etc.
> >
> > Signed-off-by: Kiran K <kiran.k@intel.com>
> > Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
> > Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
> > ---
> > net/bluetooth/hci_event.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > index ea7fc09478be..14dfbdc8b81b 100644
> > --- a/net/bluetooth/hci_event.c
> > +++ b/net/bluetooth/hci_event.c
> > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
> > static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
> > 					    u16 opcode, u8 ncmd)
> > {
> > -	if (opcode != HCI_OP_NOP)
> > -		cancel_delayed_work(&hdev->cmd_timer);
> > +	cancel_delayed_work(&hdev->cmd_timer);
> >
> > 	if (!test_bit(HCI_RESET, &hdev->flags)) {
> > 		if (ncmd) {
> 
> so this is conflicting with the patch introducing the ncmd timeout handling.
> 
My patch specifically addresses the issue observed with the NOP command. It prevents the issue by handling NOP the same as any other SIG command.

It looks like commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to recover when the controller goes bad.
  
> commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> Author: Manish Mandlik <mmandlik@google.com>
> Date:   Thu Apr 29 10:24:22 2021 -0700
> 
>     Bluetooth: Add ncmd=0 recovery handling
> 
>     During command status or command complete event, the controller may set
>     ncmd=0 indicating that it is not accepting any more commands. In such a
>     case, host holds off sending any more commands to the controller. If the
>     controller doesn't recover from such condition, host will wait forever,
>     until the user decides that the Bluetooth is broken and may power cycle
>     the Bluetooth.
> 
>     This patch triggers the hardware error to reset the controller and
>     driver when it gets into such state as there is no other way out.
> 
> Nowhere in your commit description do you address why this is the right
> thing to do.
>

Will fix it in the next version if you are OK with the current fix. Please let me know.

> Regards
> 
> Marcel

Thanks,
Kiran



* RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
  2021-08-06 14:44   ` K, Kiran
@ 2021-08-12 10:55     ` K, Kiran
  2021-08-12 17:31       ` Luiz Augusto von Dentz
  0 siblings, 1 reply; 7+ messages in thread
From: K, Kiran @ 2021-08-12 10:55 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: BlueZ, Srivatsa, Ravishankar, Tumkur Narayan, Chethan

Hi Marcel,

> -----Original Message-----
> From: K, Kiran
> Sent: Friday, August 6, 2021 8:14 PM
> To: 'Marcel Holtmann' <marcel@holtmann.org>
> Cc: BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> <chethan.tumkur.narayan@intel.com>
> Subject: RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> command
> 
> Hi Marcel,
> 
> > -----Original Message-----
> > From: Marcel Holtmann <marcel@holtmann.org>
> > Sent: Thursday, August 5, 2021 6:41 PM
> > To: K, Kiran <kiran.k@intel.com>
> > Cc: BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> > <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> > <chethan.tumkur.narayan@intel.com>
> > Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > command
> >
> > Hi Kiran,
> >
> > > For the NOP command, the work scheduled on cmd_timer needs to be
> > > cancelled on receiving a command status or command complete event.
> > >
> > > The use case below might lead to a race condition when multiple NOP
> > > commands are queued sequentially:
> > >
> > > hci_cmd_work() {
> > >   if (atomic_read(&hdev->cmd_cnt)) {
> > >            .
> > >            .
> > >            .
> > >      atomic_dec(&hdev->cmd_cnt);
> > >      hci_send_frame(hdev,...);
> > >      schedule_delayed_work(&hdev->cmd_timer,...);
> > >   }
> > > }
> > >
> > > On receiving the event for the first NOP, the work scheduled on
> > > hdev->cmd_timer is not cancelled, and the second NOP is dequeued and
> > > sent to the controller.
> > >
> > > While waiting for the event for the second NOP command, the work
> > > scheduled on cmd_timer for the first NOP can run, resulting in a third
> > > NOP command being sent without waiting for the event for the second
> > > NOP. This might cause issues on the controller side (such as memory
> > > overrun or the controller becoming unresponsive), resulting in HCI TX
> > > timeouts, hardware errors, etc.
> > >
> > > Signed-off-by: Kiran K <kiran.k@intel.com>
> > > Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
> > > Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
> > > ---
> > > net/bluetooth/hci_event.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > > index ea7fc09478be..14dfbdc8b81b 100644
> > > --- a/net/bluetooth/hci_event.c
> > > +++ b/net/bluetooth/hci_event.c
> > > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
> > > static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
> > > 					    u16 opcode, u8 ncmd)
> > > {
> > > -	if (opcode != HCI_OP_NOP)
> > > -		cancel_delayed_work(&hdev->cmd_timer);
> > > +	cancel_delayed_work(&hdev->cmd_timer);
> > >
> > > 	if (!test_bit(HCI_RESET, &hdev->flags)) {
> > > 		if (ncmd) {
> >
> > so this is conflicting with the patch introducing the ncmd timeout handling.
> >
> My patch specifically addresses the issue observed with the NOP command.
> It prevents the issue by handling NOP the same as any other SIG command.
> 
> It looks like commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to
> recover when the controller goes bad.
> 

Do you have any further comments here? I am waiting for your input.

> > commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> > Author: Manish Mandlik <mmandlik@google.com>
> > Date:   Thu Apr 29 10:24:22 2021 -0700
> >
> >     Bluetooth: Add ncmd=0 recovery handling
> >
> >     During command status or command complete event, the controller may set
> >     ncmd=0 indicating that it is not accepting any more commands. In such a
> >     case, host holds off sending any more commands to the controller. If the
> >     controller doesn't recover from such condition, host will wait forever,
> >     until the user decides that the Bluetooth is broken and may power cycle
> >     the Bluetooth.
> >
> >     This patch triggers the hardware error to reset the controller and
> >     driver when it gets into such state as there is no other way out.
> >
> > Nowhere in your commit description do you address why this is the
> > right thing to do.
> >
> 
> Will fix it in the next version if you are OK with the current fix. Please let me
> know.
> 
> > Regards
> >
> > Marcel
> 
> Thanks,
> Kiran

Thanks,
Kiran




* Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
  2021-08-12 10:55     ` K, Kiran
@ 2021-08-12 17:31       ` Luiz Augusto von Dentz
       [not found]         ` <CAGPPCLDsqa6Ae3rMOXaVAOsnvPTF3b-5ybdPbD2LptcMaCfhWA@mail.gmail.com>
  0 siblings, 1 reply; 7+ messages in thread
From: Luiz Augusto von Dentz @ 2021-08-12 17:31 UTC (permalink / raw)
  To: K, Kiran
  Cc: Marcel Holtmann, BlueZ, Srivatsa, Ravishankar, Tumkur Narayan,
	Chethan, Manish Mandlik

Hi Manish,

On Thu, Aug 12, 2021 at 3:58 AM K, Kiran <kiran.k@intel.com> wrote:
>
> Hi Marcel,
>
> > -----Original Message-----
> > From: K, Kiran
> > Sent: Friday, August 6, 2021 8:14 PM
> > To: 'Marcel Holtmann' <marcel@holtmann.org>
> > Cc: BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> > <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> > <chethan.tumkur.narayan@intel.com>
> > Subject: RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > command
> >
> > Hi Marcel,
> >
> > > -----Original Message-----
> > > From: Marcel Holtmann <marcel@holtmann.org>
> > > Sent: Thursday, August 5, 2021 6:41 PM
> > > To: K, Kiran <kiran.k@intel.com>
> > > Cc: BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> > > <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> > > <chethan.tumkur.narayan@intel.com>
> > > Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > > command
> > >
> > > Hi Kiran,
> > >
> > > > For the NOP command, the work scheduled on cmd_timer needs to be
> > > > cancelled on receiving a command status or command complete event.
> > > >
> > > > The use case below might lead to a race condition when multiple NOP
> > > > commands are queued sequentially:
> > > >
> > > > hci_cmd_work() {
> > > >   if (atomic_read(&hdev->cmd_cnt)) {
> > > >            .
> > > >            .
> > > >            .
> > > >      atomic_dec(&hdev->cmd_cnt);
> > > >      hci_send_frame(hdev,...);
> > > >      schedule_delayed_work(&hdev->cmd_timer,...);
> > > >   }
> > > > }
> > > >
> > > > On receiving the event for the first NOP, the work scheduled on
> > > > hdev->cmd_timer is not cancelled, and the second NOP is dequeued and
> > > > sent to the controller.
> > > >
> > > > While waiting for the event for the second NOP command, the work
> > > > scheduled on cmd_timer for the first NOP can run, resulting in a third
> > > > NOP command being sent without waiting for the event for the second
> > > > NOP. This might cause issues on the controller side (such as memory
> > > > overrun or the controller becoming unresponsive), resulting in HCI TX
> > > > timeouts, hardware errors, etc.
> > > >
> > > > Signed-off-by: Kiran K <kiran.k@intel.com>
> > > > Reviewed-by: Chethan T N <chethan.tumkur.narayan@intel.com>
> > > > Reviewed-by: Srivatsa Ravishankar <ravishankar.srivatsa@intel.com>
> > > > ---
> > > > net/bluetooth/hci_event.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > > > index ea7fc09478be..14dfbdc8b81b 100644
> > > > --- a/net/bluetooth/hci_event.c
> > > > +++ b/net/bluetooth/hci_event.c
> > > > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
> > > > static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
> > > >                                       u16 opcode, u8 ncmd)
> > > > {
> > > > - if (opcode != HCI_OP_NOP)
> > > > -         cancel_delayed_work(&hdev->cmd_timer);
> > > > + cancel_delayed_work(&hdev->cmd_timer);
> > > >
> > > >   if (!test_bit(HCI_RESET, &hdev->flags)) {
> > > >           if (ncmd) {
> > >
> > > so this is conflicting with the patch introducing the ncmd timeout handling.
> > >
> > My patch specifically addresses the issue observed with the NOP command.
> > It prevents the issue by handling NOP the same as any other SIG command.
> >
> > It looks like commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to
> > recover when the controller goes bad.
> >
>
> Do you have any further comments here ? Waiting for your input.
>
> > > commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> > > Author: Manish Mandlik <mmandlik@google.com>
> > > Date:   Thu Apr 29 10:24:22 2021 -0700
> > >
> > >     Bluetooth: Add ncmd=0 recovery handling
> > >
> > >     During command status or command complete event, the controller may set
> > >     ncmd=0 indicating that it is not accepting any more commands. In such a
> > >     case, host holds off sending any more commands to the controller. If the
> > >     controller doesn't recover from such condition, host will wait forever,
> > >     until the user decides that the Bluetooth is broken and may power cycle
> > >     the Bluetooth.
> > >
> > >     This patch triggers the hardware error to reset the controller and
> > >     driver when it gets into such state as there is no other way out.
> > >
> > > Nowhere in your commit description do you address why this is the
> > > right thing to do.
> > >
> >
> > Will fix it in the next version if you are OK with the current fix. Please let me
> > know.

Can you confirm this change doesn't break your patch above?

> >
> > > Regards
> > >
> > > Marcel
> >
> > Thanks,
> > Kiran
>
> Thanks,
> Kiran
>
>


-- 
Luiz Augusto von Dentz


* RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
       [not found]         ` <CAGPPCLDsqa6Ae3rMOXaVAOsnvPTF3b-5ybdPbD2LptcMaCfhWA@mail.gmail.com>
@ 2021-08-15 23:29           ` K, Kiran
  0 siblings, 0 replies; 7+ messages in thread
From: K, Kiran @ 2021-08-15 23:29 UTC (permalink / raw)
  To: Manish Mandlik, Luiz Augusto von Dentz
  Cc: Marcel Holtmann, BlueZ, Srivatsa, Ravishankar, Tumkur Narayan, Chethan

Hi Manish, 
> 
> From: Manish Mandlik <mmandlik@google.com> 
> Sent: Saturday, August 14, 2021 5:20 AM
> To: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
> Cc: K, Kiran <kiran.k@intel.com>; Marcel Holtmann <marcel@holtmann.org>; BlueZ <linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar <ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan <chethan.tumkur.narayan@intel.com>
> Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP command
> 
> Hi Luiz,
> 
> This patch looks ok to me, it'll not break ncmd timeout handling. 
> 
> @Kiran: Just one nit on the patch: we can get rid of the argument 'u16 opcode' of the function 'handle_cmd_cnt_and_timer()' as it won't be used anymore in this case.

Thanks for the comment. I will fix it and send out an updated version.
> 
> Regards,
> Manish.
> 
> 
> On Thu, Aug 12, 2021 at 10:31 AM Luiz Augusto von Dentz <mailto:luiz.dentz@gmail.com> wrote:
> Hi Manish,
> 
> On Thu, Aug 12, 2021 at 3:58 AM K, Kiran <mailto:kiran.k@intel.com> wrote:
> >
> > Hi Marcel,
> >
> > > -----Original Message-----
> > > From: K, Kiran
> > > Sent: Friday, August 6, 2021 8:14 PM
> > > To: 'Marcel Holtmann' <mailto:marcel@holtmann.org>
> > > Cc: BlueZ <mailto:linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> > > <mailto:ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> > > <mailto:chethan.tumkur.narayan@intel.com>
> > > Subject: RE: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > > command
> > >
> > > Hi Marcel,
> > >
> > > > -----Original Message-----
> > > > From: Marcel Holtmann <mailto:marcel@holtmann.org>
> > > > Sent: Thursday, August 5, 2021 6:41 PM
> > > > To: K, Kiran <mailto:kiran.k@intel.com>
> > > > Cc: BlueZ <mailto:linux-bluetooth@vger.kernel.org>; Srivatsa, Ravishankar
> > > > <mailto:ravishankar.srivatsa@intel.com>; Tumkur Narayan, Chethan
> > > > <mailto:chethan.tumkur.narayan@intel.com>
> > > > Subject: Re: [PATCH v1] Bluetooth: Fix race condition in handling NOP
> > > > command
> > > >
> > > > Hi Kiran,
> > > >
> > > > > For the NOP command, the work scheduled on cmd_timer needs to be
> > > > > cancelled on receiving a command status or command complete event.
> > > > >
> > > > > The use case below might lead to a race condition when multiple NOP
> > > > > commands are queued sequentially:
> > > > >
> > > > > hci_cmd_work() {
> > > > >   if (atomic_read(&hdev->cmd_cnt)) {
> > > > >            .
> > > > >            .
> > > > >            .
> > > > >      atomic_dec(&hdev->cmd_cnt);
> > > > >      hci_send_frame(hdev,...);
> > > > >      schedule_delayed_work(&hdev->cmd_timer,...);
> > > > >   }
> > > > > }
> > > > >
> > > > > On receiving the event for the first NOP, the work scheduled on
> > > > > hdev->cmd_timer is not cancelled, and the second NOP is dequeued and
> > > > > sent to the controller.
> > > > >
> > > > > While waiting for the event for the second NOP command, the work
> > > > > scheduled on cmd_timer for the first NOP can run, resulting in a third
> > > > > NOP command being sent without waiting for the event for the second
> > > > > NOP. This might cause issues on the controller side (such as memory
> > > > > overrun or the controller becoming unresponsive), resulting in HCI TX
> > > > > timeouts, hardware errors, etc.
> > > > >
> > > > > Signed-off-by: Kiran K <mailto:kiran.k@intel.com>
> > > > > Reviewed-by: Chethan T N <mailto:chethan.tumkur.narayan@intel.com>
> > > > > Reviewed-by: Srivatsa Ravishankar <mailto:ravishankar.srivatsa@intel.com>
> > > > > ---
> > > > > net/bluetooth/hci_event.c | 3 +--
> > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> > > > > index ea7fc09478be..14dfbdc8b81b 100644
> > > > > --- a/net/bluetooth/hci_event.c
> > > > > +++ b/net/bluetooth/hci_event.c
> > > > > @@ -3271,8 +3271,7 @@ static void hci_remote_features_evt(struct hci_dev *hdev,
> > > > > static inline void handle_cmd_cnt_and_timer(struct hci_dev *hdev,
> > > > >                                       u16 opcode, u8 ncmd)
> > > > > {
> > > > > - if (opcode != HCI_OP_NOP)
> > > > > -         cancel_delayed_work(&hdev->cmd_timer);
> > > > > + cancel_delayed_work(&hdev->cmd_timer);
> > > > >
> > > > >   if (!test_bit(HCI_RESET, &hdev->flags)) {
> > > > >           if (ncmd) {
> > > >
> > > > so this is conflicting with the patch introducing the ncmd timeout handling.
> > > >
> > > My patch specifically addresses the issue observed with the NOP command.
> > > It prevents the issue by handling NOP the same as any other SIG command.
> > >
> > > It looks like commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4 tries to
> > > recover when the controller goes bad.
> > >
> >
> > Do you have any further comments here ? Waiting for your input.
> >
> > > > commit de75cd0d9b2f3250d5f25846bb5632ccce6275f4
> > > > Author: Manish Mandlik <mailto:mmandlik@google.com>
> > > > Date:   Thu Apr 29 10:24:22 2021 -0700
> > > >
> > > >     Bluetooth: Add ncmd=0 recovery handling
> > > >
> > > >     During command status or command complete event, the controller may set
> > > >     ncmd=0 indicating that it is not accepting any more commands. In such a
> > > >     case, host holds off sending any more commands to the controller. If the
> > > >     controller doesn't recover from such condition, host will wait forever,
> > > >     until the user decides that the Bluetooth is broken and may power cycle
> > > >     the Bluetooth.
> > > >
> > > >     This patch triggers the hardware error to reset the controller and
> > > >     driver when it gets into such state as there is no other way out.
> > > >
> > > > Nowhere in your commit description do you address why this is the
> > > > right thing to do.
> > > >
> > >
> > > Will fix it in the next version if you are OK with the current fix. Please let me
> > > know.
> 
> Can you confirm this change doesn't break your patch above?
> 
> > >
> > > > Regards
> > > >
> > > > Marcel
> > >
> > > Thanks,
> > > Kiran
> >
> > Thanks,
> > Kiran
> >
> >
> 
> 
> -- 
> Luiz Augusto von Dentz
>

Regards,
Kiran
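
Note: Manish's nit about dropping the 'u16 opcode' argument might look
roughly like the sketch below. It is a userspace model, not the updated
patch: struct hdev_model and its boolean fields are hypothetical stand-ins
for hci_dev, the HCI_RESET flag, cmd_timer, and the ncmd=0 recovery timer,
and the body of the 'if (ncmd)' branch is an assumption, since the quoted
diff shows only the first lines of the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hci_dev state the helper touches. */
struct hdev_model {
	bool in_reset;           /* models test_bit(HCI_RESET, &hdev->flags) */
	int cmd_cnt;             /* models hdev->cmd_cnt */
	bool cmd_timer_pending;  /* models pending work on hdev->cmd_timer */
	bool ncmd_timer_pending; /* assumed timer for ncmd=0 recovery */
};

/* Helper without the opcode argument: every event cancels the command timer. */
void model_handle_cmd_cnt_and_timer(struct hdev_model *h, uint8_t ncmd)
{
	assert(h);
	h->cmd_timer_pending = false; /* unconditional, NOP included */

	if (!h->in_reset) {
		if (ncmd) {
			/* Controller accepts commands: allow the next one. */
			h->ncmd_timer_pending = false;
			h->cmd_cnt = 1;
		} else {
			/* ncmd=0: leave recovery to the separate timer. */
			h->ncmd_timer_pending = true;
		}
	}
}
```

Callers would then pass only the device and the ncmd value taken from the
event, since nothing in the helper depends on the opcode any more.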

