* shutdown(3) and bluetooth.
@ 2013-11-12 21:11 Dave Jones
  2013-11-12 21:13   ` David Miller
       [not found] ` <20131112211125.GA2912-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  0 siblings, 2 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 21:11 UTC (permalink / raw)
  To: netdev

Is shutdown() allowed to block indefinitely? The man page doesn't say either way,
and I've noticed that my fuzz tester occasionally hangs for days, spinning in bt_sock_wait_state().

Is there something I should be doing to guarantee that this operation
will either time out or return instantly?

In this specific case, I doubt anything is on the "sender" end of the socket, so
it's going to be waiting forever for a state change that won't arrive.

	Dave

* Re: shutdown(3) and bluetooth.
       [not found] ` <20131112211125.GA2912-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
@ 2013-11-12 21:13   ` David Miller
  0 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2013-11-12 21:13 UTC (permalink / raw)
  To: davej; +Cc: netdev, linux-bluetooth, linux-wireless

From: Dave Jones <davej@redhat.com>
Date: Tue, 12 Nov 2013 16:11:25 -0500

> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
> 
> Is there something I should be doing to guarantee that this operation
> will either time out, or return instantly ?
> 
> In this specific case, I doubt anything is on the "sender" end of the socket, so
> it's going to be waiting forever for a state change that won't arrive.

Adding bluetooth and wireless lists.  Dave, please consult MAINTAINERS when
asking questions like this, thanks!

* Re: shutdown(3) and bluetooth.
  2013-11-12 21:11 shutdown(3) and bluetooth Dave Jones
@ 2013-11-12 21:56     ` Marcel Holtmann
       [not found] ` <20131112211125.GA2912-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  1 sibling, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 21:56 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development

Hi Dave,

> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
> 
> Is there something I should be doing to guarantee that this operation
> will either time out, or return instantly ?
> 
> In this specific case, I doubt anything is on the "sender" end of the socket, so
> it's going to be waiting forever for a state change that won't arrive.

Can you give us some extra information here? What kind of Bluetooth socket is this, actually? Off the top of my head, I have no idea why we would even wait forever. Normally, when all low-level links are gone, the socket will shut down anyway.

Regards

Marcel

* Re: shutdown(3) and bluetooth.
  2013-11-12 21:56     ` Marcel Holtmann
@ 2013-11-12 22:10         ` Dave Jones
  -1 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 22:10 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development

On Wed, Nov 13, 2013 at 06:56:23AM +0900, Marcel Holtmann wrote:
 > Hi Dave,
 > 
 > > Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
 > > and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
 > > 
 > > Is there something I should be doing to guarantee that this operation
 > > will either time out, or return instantly ?
 > > 
 > > In this specific case, I doubt anything is on the "sender" end of the socket, so
 > > it's going to be waiting forever for a state change that won't arrive.
 > 
 > can you give us some extra information here. What kind of Bluetooth socket is this actually. From the top of my head, I have no idea why we would even wait forever. Normally when all low-level links are gone, the socket will shut down anyway.

Here's the info I found in the logs. It looks like this was the only bluetooth socket.

 fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
 Setsockopt(1 d 2134000 8) on fd 195

It doesn't look like any further operations were done on this fd during the fuzzer's runtime.

Quick way to reproduce:

./trinity -P PF_BLUETOOTH -l off -c setsockopt

Let it run for a few seconds, and then hit ctrl-c.  The main process will never exit.

 5814 pts/6    Ss     0:00              |       \_ bash
 5876 pts/6    S+     0:00              |       |   \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
 5877 pts/6    Z+     0:00              |       |       \_ [trinity] <defunct>
 5878 pts/6    S+     0:01              |       |       \_ [trinity-main]

$ sudo cat /proc/5878/stack
[<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
[<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
[<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
[<ffffffff81532fcf>] sock_release+0x1f/0x80
[<ffffffff81533042>] sock_close+0x12/0x20
[<ffffffff811a9ac1>] __fput+0xe1/0x230
[<ffffffff811a9c5e>] ____fput+0xe/0x10
[<ffffffff8108534c>] task_work_run+0xbc/0xe0
[<ffffffff8106944c>] do_exit+0x2bc/0xa20
[<ffffffff81069c2f>] do_group_exit+0x3f/0xa0
[<ffffffff81069ca4>] SyS_exit_group+0x14/0x20
[<ffffffff81656b27>] tracesys+0xdd/0xe2
[<ffffffffffffffff>] 0xffffffffffffffff


	Dave
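
Each /proc/<pid>/stack line above follows the same pattern: address, symbol+offset/size, and an optional module name in brackets. Below is a minimal sketch of splitting such a line into its fields. It assumes exactly the format shown in this thread (other kernel versions may print these lines differently); struct stack_frame and parse_stack_line() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded line of /proc/<pid>/stack, in the format shown in this thread. */
struct stack_frame {
	unsigned long long addr;	/* return address inside the function */
	char symbol[64];		/* function name */
	unsigned long long offset;	/* offset of addr within the function */
	unsigned long long size;	/* total size of the function */
	char module[32];		/* module name, empty for built-in code */
};

/*
 * Parse a line such as
 *   "[<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]"
 * Returns 0 on success, -1 if the line does not match the pattern.
 */
static int parse_stack_line(const char *line, struct stack_frame *out)
{
	int n;

	assert(line != NULL && out != NULL);
	out->module[0] = '\0';

	/* The trailing "[module]" part is optional, so 4 or 5 fields are fine. */
	n = sscanf(line, " [<%llx>] %63[^+]+%llx/%llx [%31[^]]]",
		   &out->addr, out->symbol, &out->offset, &out->size,
		   out->module);
	return n >= 4 ? 0 : -1;
}
```

With the top frame of the trace above, this yields offset 0xc2 into a function of size 0x190 in the bluetooth module; the final 0xffffffffffffffff marker line does not match and is rejected.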

* Re: shutdown(3) and bluetooth.
  2013-11-12 22:10         ` Dave Jones
@ 2013-11-12 22:32             ` Marcel Holtmann
  -1 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 22:32 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development

Hi Dave,

>>> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
>>> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
>>> 
>>> Is there something I should be doing to guarantee that this operation
>>> will either time out, or return instantly ?
>>> 
>>> In this specific case, I doubt anything is on the "sender" end of the socket, so
>>> it's going to be waiting forever for a state change that won't arrive.
>> 
>> can you give us some extra information here. What kind of Bluetooth socket is this actually. From the top of my head, I have no idea why we would even wait forever. Normally when all low-level links are gone, the socket will shut down anyway.
> 
> Here's the info I found in the logs, it looks like this was the only bluetooth socket.
> 
> fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
> Setsockopt(1 d 2134000 8) on fd 195

This is a bit confusing. Protocol 2 is actually SCO, but the stack trace shows RFCOMM.

> it doesn't look like any further operations were done on this fd during the fuzzers runtime.
> 
> Quick way to reproduce:
> 
> ./trinity -P PF_BLUETOOTH -l off -c setsockopt
> 
> let it run a few seconds, and then ctrl-c.  The main process will never exit.
> 
> 5814 pts/6    Ss     0:00              |       \_ bash
> 5876 pts/6    S+     0:00              |       |   \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
> 5877 pts/6    Z+     0:00              |       |       \_ [trinity] <defunct>
> 5878 pts/6    S+     0:01              |       |       \_ [trinity-main]
> 
> $ sudo cat /proc/5878/stack
> [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
> [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
> [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
> [<ffffffff81532fcf>] sock_release+0x1f/0x80
> [<ffffffff81533042>] sock_close+0x12/0x20
> [<ffffffff811a9ac1>] __fput+0xe1/0x230
> [<ffffffff811a9c5e>] ____fput+0xe/0x10
> [<ffffffff8108534c>] task_work_run+0xbc/0xe0
> [<ffffffff8106944c>] do_exit+0x2bc/0xa20
> [<ffffffff81069c2f>] do_group_exit+0x3f/0xa0
> [<ffffffff81069ca4>] SyS_exit_group+0x14/0x20
> [<ffffffff81656b27>] tracesys+0xdd/0xe2
> [<ffffffffffffffff>] 0xffffffffffffffff

What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly? There was a socket-related fix for the socket options where we confused the RFCOMM and L2CAP struct sock.

Regards

Marcel
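
As a reading aid for the fuzzer log quoted above, the numbers can be decoded by hand. The sketch below is illustrative only. The BT_PROTO_* values are copied from the kernel's BTPROTO_* definitions rather than included from a header, and the helper names (bt_proto_name, sock_type_name, sockopt_name, describe_socket) are made up for this example. It also assumes that the log line "Setsockopt(1 d 2134000 8)" lists level, optname, optval pointer and optlen, in that order. On x86 Linux, SOL_SOCKET is 1 and SO_LINGER is 13 (0xd), so under that assumption the logged call would be an SO_LINGER setsockopt with an 8-byte argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Copied from the kernel's BTPROTO_* values so no BlueZ headers are needed. */
enum bt_proto {
	BT_PROTO_L2CAP  = 0,
	BT_PROTO_HCI    = 1,
	BT_PROTO_SCO    = 2,
	BT_PROTO_RFCOMM = 3,
};

static const char *bt_proto_name(int protocol)
{
	switch (protocol) {
	case BT_PROTO_L2CAP:  return "L2CAP";
	case BT_PROTO_HCI:    return "HCI";
	case BT_PROTO_SCO:    return "SCO";
	case BT_PROTO_RFCOMM: return "RFCOMM";
	default:              return "unknown";
	}
}

static const char *sock_type_name(int type)
{
	switch (type) {
	case SOCK_STREAM:    return "SOCK_STREAM";
	case SOCK_DGRAM:     return "SOCK_DGRAM";
	case SOCK_RAW:       return "SOCK_RAW";
	case SOCK_SEQPACKET: return "SOCK_SEQPACKET";
	default:             return "unknown";
	}
}

/* Only a few SOL_SOCKET options are named here; extend as needed. */
static const char *sockopt_name(int level, int optname)
{
	if (level != SOL_SOCKET)
		return "unknown";
	switch (optname) {
	case SO_REUSEADDR: return "SO_REUSEADDR";
	case SO_KEEPALIVE: return "SO_KEEPALIVE";
	case SO_LINGER:    return "SO_LINGER";
	default:           return "unknown";
	}
}

/* Writes "TYPE/PROTO" into buf and returns the length snprintf() reports. */
static int describe_socket(char *buf, size_t len, int type, int protocol)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s/%s",
			sock_type_name(type), bt_proto_name(protocol));
}
```

For the logged socket, describe_socket(buf, sizeof(buf), 0x5, 2) produces "SOCK_SEQPACKET/SCO", which matches the remark above that protocol 2 is SCO.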

* Re: shutdown(3) and bluetooth.
  2013-11-12 22:32             ` Marcel Holtmann
@ 2013-11-12 22:48                 ` Dave Jones
  -1 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 22:48 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development

On Wed, Nov 13, 2013 at 07:32:09AM +0900, Marcel Holtmann wrote:
 
 > > Here's the info I found in the logs, it looks like this was the only bluetooth socket.
 > > 
 > > fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
 > > Setsockopt(1 d 2134000 8) on fd 195
 > 
 > this is a bit confusing. Protocol 2 is actually SCO, but the stack trace shows RFCOMM.
 
Sorry, I mixed up two separate runs. For the log above, the stack trace is actually:

[<ffffffffa0492dca>] bt_sock_wait_state+0xda/0x240 [bluetooth]
[<ffffffffa04c86d8>] sco_sock_release+0xb8/0xf0 [bluetooth]
[<ffffffff815cb1ff>] sock_release+0x1f/0x90
[<ffffffff815cb282>] sock_close+0x12/0x20


 > > ./trinity -P PF_BLUETOOTH -l off -c setsockopt
 > > 
 > > let it run a few seconds, and then ctrl-c.  The main process will never exit.
 > > 
 > > 5814 pts/6    Ss     0:00              |       \_ bash
 > > 5876 pts/6    S+     0:00              |       |   \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
 > > 5877 pts/6    Z+     0:00              |       |       \_ [trinity] <defunct>
 > > 5878 pts/6    S+     0:01              |       |       \_ [trinity-main]
 > > 
 > > $ sudo cat /proc/5878/stack
 > > [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
 > > [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
 > > [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]

So it seems it affects both SCO and RFCOMM.

 > What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
 > There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.

I first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6.
I'll look at linux-next tomorrow.

thanks,

	Dave

* Re: shutdown(3) and bluetooth.
  2013-11-12 22:48                 ` Dave Jones
@ 2013-11-12 23:37                     ` Marcel Holtmann
  -1 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 23:37 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development

Hi Dave,

>>> Here's the info I found in the logs, it looks like this was the only bluetooth socket.
>>> 
>>> fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
>>> Setsockopt(1 d 2134000 8) on fd 195
>> 
>> this is a bit confusing. Protocol 2 is actually SCO, but the stack trace shows RFCOMM.
> 
> Sorry, mixed up two separate runs. In the log above, the stack trace is actually..
> 
> [<ffffffffa0492dca>] bt_sock_wait_state+0xda/0x240 [bluetooth]
> [<ffffffffa04c86d8>] sco_sock_release+0xb8/0xf0 [bluetooth]
> [<ffffffff815cb1ff>] sock_release+0x1f/0x90
> [<ffffffff815cb282>] sock_close+0x12/0x20
> 
> 
>>> ./trinity -P PF_BLUETOOTH -l off -c setsockopt
>>> 
>>> let it run a few seconds, and then ctrl-c.  The main process will never exit.
>>> 
>>> 5814 pts/6    Ss     0:00              |       \_ bash
>>> 5876 pts/6    S+     0:00              |       |   \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
>>> 5877 pts/6    Z+     0:00              |       |       \_ [trinity] <defunct>
>>> 5878 pts/6    S+     0:01              |       |       \_ [trinity-main]
>>> 
>>> $ sudo cat /proc/5878/stack
>>> [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
>>> [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
>>> [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
> 
> So it seems it affects both SCO and RFCOMM.
> 
>> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
>> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
> 
> first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
> I'll look at linux-next tomorrow.

I looked through the code, and we only call bt_sock_wait_state() when the SOCK_LINGER flag and sk_lingertime are set. In that case we actually block until the socket state changes to BT_CLOSED.

The only way I can see this happening is if you have a huge linger timeout and confused the socket state beforehand. What is the actual list of system calls that you are throwing at this socket?

Regards

Marcel

* Re: shutdown(3) and bluetooth.
  2013-11-12 23:37                     ` Marcel Holtmann
@ 2013-11-13  0:28                         ` Dave Jones
  -1 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-13  0:28 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development

On Wed, Nov 13, 2013 at 08:37:15AM +0900, Marcel Holtmann wrote:
 
 > > So it seems it affects both SCO and RFCOMM.
 > > 
 > >> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
 > >> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
 > > 
 > > first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
 > > I'll look at linux-next tomorrow.
 > 
 > I looked through the code and only call bt_sock_wait_state when SOCK_LINGER and sk_lingertime is set. In that case we actually block until the socket state changes to BT_CLOSED.
 > 
 > The only way I see this could happen is if you have a huge linger timeout and confused the socket state before. What is actually the list of system calls that you are throwing at this socket.

Ah. I recently changed some code that's now doing this on every socket at shutdown
(simplified cut-n-paste):

        struct linger ling = { .l_onoff = FALSE, };

        for (i = 0; i < nr_sockets; i++) {
                fd = shm->sockets[i].fd;
                shm->sockets[i].fd = 0;

                setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(struct linger));
                shutdown(fd, SHUT_RDWR);
                close(fd);
        }

I could just rip out that linger code completely and hope that sockets staying in
TIME_WAIT is good enough. IIRC, I added it because, after multiple runs, some of the
weirder protocols would fail to open a socket once a certain number of existing
sockets had been opened, even if they were in SOCK_WAIT.

Two questions remain, though. That code is setting linger to false. Why would
that cause sk_lingertime to be taken into consideration?  And why is this
only a problem for bluetooth (apparently)?

	Dave
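
One way to narrow down the two questions above is to make the teardown loop check its own work: set both struct linger fields explicitly, read the option back, and count any socket where the kernel still reports linger as enabled. This is a sketch under assumptions, not the fuzzer's actual code. close_all_without_linger() is a hypothetical name, and closed slots are marked with -1 here rather than 0.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Disable SO_LINGER on every fd in fds[], verify that the kernel agrees,
 * then shut down and close it. Slots holding a negative fd are skipped.
 * Returns the number of sockets for which any step (other than shutdown)
 * failed.
 */
static size_t close_all_without_linger(int *fds, size_t n)
{
	const struct linger off = { .l_onoff = 0, .l_linger = 0 };
	size_t failures = 0;

	assert(fds != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct linger seen;
		socklen_t len = sizeof(seen);
		int fd = fds[i];
		int failed = 0;

		if (fd < 0)
			continue;
		fds[i] = -1;

		if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &off, sizeof(off)) < 0)
			failed = 1;
		else if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &seen, &len) < 0 ||
			 seen.l_onoff != 0)
			failed = 1;	/* kernel still reports linger as on */

		/* shutdown() fails on unconnected sockets; that is expected. */
		(void)shutdown(fd, SHUT_RDWR);
		if (close(fd) < 0)
			failed = 1;

		failures += failed;
	}
	return failures;
}
```

A non-zero return value would point at a socket where the setsockopt() call did not take effect, which is the case worth logging before the close.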

* Re: shutdown(3) and bluetooth.
  2013-11-13  0:28                         ` Dave Jones
  (?)
@ 2013-11-13  1:58                         ` Marcel Holtmann
  -1 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-13  1:58 UTC (permalink / raw)
  To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development

Hi Dave,

>>> So it seems it affects both SCO and RFCOMM.
>>> 
>>>> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
>>>> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
>>> 
>>> first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
>>> I'll look at linux-next tomorrow.
>> 
>> I looked through the code and only call bt_sock_wait_state when SOCK_LINGER and sk_lingertime is set. In that case we actually block until the socket state changes to BT_CLOSED.
>> 
>> The only way I see this could happen is if you have a huge linger timeout and confused the socket state before. What is actually the list of system calls that you are throwing at this socket.
> 
> Ah. I recently changed some code that's now doing this on every socket at shutdown..
> (simplified cut-n-paste)
> 
>        struct linger ling = { .l_onoff = FALSE, };
> 
>        for (i = 0; i < nr_sockets; i++) {
>                fd = shm->sockets[i].fd;
>                shm->sockets[i].fd = 0;
> 
>                setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(struct linger));
>                shutdown(fd, SHUT_RDWR);
>                close(fd);
>        }
> 
> I could just rip out that linger code completely and just hope that sockets staying in
> TIME_WAIT is good enough. iirc, I added it when after multiple runs, some of the
> weirder protocols would fail to open a socket once a certain number of existing
> sockets had opened, even if they were in SOCK_WAIT
> 
> two remaining questions though. That code is setting linger to false. Why would
> that cause the sk_lingertime to be taken into consideration ?  And why is this
> only a problem for bluetooth (apparently) ?

We are not touching that part of setsockopt. That is handled by net/core/sock.c; we just check whether the SOCK_LINGER flag is set and whether we have a positive sk_lingertime. So it is a bit suspicious that this is happening, but I don’t think it is our mistake.

Regards

Marcel

* Re: shutdown(3) and bluetooth.
  2013-11-12 21:13   ` David Miller
  (?)
@ 2013-11-13 14:02   ` John W. Linville
  -1 siblings, 0 replies; 17+ messages in thread
From: John W. Linville @ 2013-11-13 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: davej, netdev, linux-bluetooth, linux-wireless

On Tue, Nov 12, 2013 at 04:13:50PM -0500, David Miller wrote:
> From: Dave Jones <davej@redhat.com>
> Date: Tue, 12 Nov 2013 16:11:25 -0500
> 
> > Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> > and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
> > 
> > Is there something I should be doing to guarantee that this operation
> > will either time out, or return instantly ?
> > 
> > In this specific case, I doubt anything is on the "sender" end of the socket, so
> > it's going to be waiting forever for a state change that won't arrive.
> 
> Adding bluetooth and wireless lists.  Dave, please consult MAINTAINERS when
> asking questions like this, thanks!

I don't have an authoritative answer.  I do, however, seem to recall
that trying to shut down a SunOS box with a hung NFS mount would seem
to hang forever.  I don't think that is a great metric for how we
should behave, of course...

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.
