* shutdown(3) and bluetooth.
@ 2013-11-12 21:11 Dave Jones
2013-11-12 21:13 ` David Miller
0 siblings, 2 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 21:11 UTC (permalink / raw)
To: netdev
Is shutdown() allowed to block indefinitely? The man page doesn't say either way,
and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state().
Is there something I should be doing to guarantee that this operation
will either time out or return instantly?
In this specific case, I doubt anything is on the "sender" end of the socket, so
it's going to be waiting forever for a state change that won't arrive.
Dave
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 21:13 ` David Miller
0 siblings, 0 replies; 17+ messages in thread
From: David Miller @ 2013-11-12 21:13 UTC (permalink / raw)
To: davej; +Cc: netdev, linux-bluetooth, linux-wireless
From: Dave Jones <davej@redhat.com>
Date: Tue, 12 Nov 2013 16:11:25 -0500
> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
>
> Is there something I should be doing to guarantee that this operation
> will either time out, or return instantly ?
>
> In this specific case, I doubt anything is on the "sender" end of the socket, so
> it's going to be waiting forever for a state change that won't arrive.
Adding bluetooth and wireless lists. Dave, please consult MAINTAINERS when
asking questions like this, thanks!
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 21:56 ` Marcel Holtmann
0 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 21:56 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development
Hi Dave,
> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
>
> Is there something I should be doing to guarantee that this operation
> will either time out, or return instantly ?
>
> In this specific case, I doubt anything is on the "sender" end of the socket, so
> it's going to be waiting forever for a state change that won't arrive.
Can you give us some extra information here? What kind of Bluetooth socket is this, actually? Off the top of my head, I have no idea why we would even wait forever. Normally, when all low-level links are gone, the socket shuts down anyway.
Regards
Marcel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 22:10 ` Dave Jones
0 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 22:10 UTC (permalink / raw)
To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development
On Wed, Nov 13, 2013 at 06:56:23AM +0900, Marcel Holtmann wrote:
> Hi Dave,
>
> > Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> > and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
> >
> > Is there something I should be doing to guarantee that this operation
> > will either time out, or return instantly ?
> >
> > In this specific case, I doubt anything is on the "sender" end of the socket, so
> > it's going to be waiting forever for a state change that won't arrive.
>
> can you give us some extra information here. What kind of Bluetooth socket is this actually. From the top of my head, I have no idea why we would even wait forever. Normally when all low-level links are gone, the socket will shut down anyway.
Here's the info I found in the logs; it looks like this was the only Bluetooth socket.
fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
Setsockopt(1 d 2134000 8) on fd 195
It doesn't look like any further operations were done on this fd during the fuzzer's runtime.
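The two log lines above can be decoded against the Linux socket constants. The sketch below assumes that the "Setsockopt(...)" fields are (level, optname, optval pointer, optlen) printed in hex; that reading of the fuzzer's log format is an assumption, not something stated in this thread. The example_* helper names are hypothetical.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Does "type:0x5" from the fd line name a SOCK_SEQPACKET socket? */
bool example_is_seqpacket(long type)
{
    return type == SOCK_SEQPACKET;
}

/*
 * Does a logged (level, optname, optlen) triple look like an SO_LINGER
 * request, i.e. SOL_SOCKET level, SO_LINGER option, and a buffer large
 * enough to hold a struct linger?
 */
bool example_is_linger_request(long level, long optname, long optlen)
{
    return level == SOL_SOCKET &&
           optname == SO_LINGER &&
           optlen >= (long)sizeof(struct linger);
}
```

If that reading of the log is right, example_is_linger_request(0x1, 0xd, 0x8) holds on Linux, which would mean the one option set on fd 195 was SO_LINGER, with whatever contents the fuzzer placed at the optval address.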
Quick way to reproduce:
./trinity -P PF_BLUETOOTH -l off -c setsockopt
Let it run for a few seconds, then hit Ctrl-C. The main process will never exit.
5814 pts/6 Ss 0:00 | \_ bash
5876 pts/6 S+ 0:00 | | \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
5877 pts/6 Z+ 0:00 | | \_ [trinity] <defunct>
5878 pts/6 S+ 0:01 | | \_ [trinity-main]
$ sudo cat /proc/5878/stack
[<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
[<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
[<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
[<ffffffff81532fcf>] sock_release+0x1f/0x80
[<ffffffff81533042>] sock_close+0x12/0x20
[<ffffffff811a9ac1>] __fput+0xe1/0x230
[<ffffffff811a9c5e>] ____fput+0xe/0x10
[<ffffffff8108534c>] task_work_run+0xbc/0xe0
[<ffffffff8106944c>] do_exit+0x2bc/0xa20
[<ffffffff81069c2f>] do_group_exit+0x3f/0xa0
[<ffffffff81069ca4>] SyS_exit_group+0x14/0x20
[<ffffffff81656b27>] tracesys+0xdd/0xe2
[<ffffffffffffffff>] 0xffffffffffffffff
Dave
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 22:32 ` Marcel Holtmann
0 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 22:32 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development
Hi Dave,
>>> Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
>>> and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
>>>
>>> Is there something I should be doing to guarantee that this operation
>>> will either time out, or return instantly ?
>>>
>>> In this specific case, I doubt anything is on the "sender" end of the socket, so
>>> it's going to be waiting forever for a state change that won't arrive.
>>
>> can you give us some extra information here. What kind of Bluetooth socket is this actually. From the top of my head, I have no idea why we would even wait forever. Normally when all low-level links are gone, the socket will shut down anyway.
>
> Here's the info I found in the logs, it looks like this was the only bluetooth socket.
>
> fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
> Setsockopt(1 d 2134000 8) on fd 195
This is a bit confusing: protocol 2 is actually SCO, but the stack trace shows RFCOMM.
> it doesn't look like any further operations were done on this fd during the fuzzers runtime.
>
> Quick way to reproduce:
>
> ./trinity -P PF_BLUETOOTH -l off -c setsockopt
>
> let it run a few seconds, and then ctrl-c. The main process will never exit.
>
> 5814 pts/6 Ss 0:00 | \_ bash
> 5876 pts/6 S+ 0:00 | | \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
> 5877 pts/6 Z+ 0:00 | | \_ [trinity] <defunct>
> 5878 pts/6 S+ 0:01 | | \_ [trinity-main]
>
> $ sudo cat /proc/5878/stack
> [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
> [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
> [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
> [<ffffffff81532fcf>] sock_release+0x1f/0x80
> [<ffffffff81533042>] sock_close+0x12/0x20
> [<ffffffff811a9ac1>] __fput+0xe1/0x230
> [<ffffffff811a9c5e>] ____fput+0xe/0x10
> [<ffffffff8108534c>] task_work_run+0xbc/0xe0
> [<ffffffff8106944c>] do_exit+0x2bc/0xa20
> [<ffffffff81069c2f>] do_group_exit+0x3f/0xa0
> [<ffffffff81069ca4>] SyS_exit_group+0x14/0x20
> [<ffffffff81656b27>] tracesys+0xdd/0xe2
> [<ffffffffffffffff>] 0xffffffffffffffff
Which kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly? There was a socket-related fix for the socket options where we confused the RFCOMM and L2CAP struct sock.
Regards
Marcel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 22:48 ` Dave Jones
0 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-12 22:48 UTC (permalink / raw)
To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development
On Wed, Nov 13, 2013 at 07:32:09AM +0900, Marcel Holtmann wrote:
> > Here's the info I found in the logs, it looks like this was the only bluetooth socket.
> >
> > fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
> > Setsockopt(1 d 2134000 8) on fd 195
>
> this is a bit confusing. Protocol 2 is actually SCO, but the stack trace shows RFCOMM.
Sorry, I mixed up two separate runs. For the log above, the stack trace is actually:
[<ffffffffa0492dca>] bt_sock_wait_state+0xda/0x240 [bluetooth]
[<ffffffffa04c86d8>] sco_sock_release+0xb8/0xf0 [bluetooth]
[<ffffffff815cb1ff>] sock_release+0x1f/0x90
[<ffffffff815cb282>] sock_close+0x12/0x20
> > ./trinity -P PF_BLUETOOTH -l off -c setsockopt
> >
> > let it run a few seconds, and then ctrl-c. The main process will never exit.
> >
> > 5814 pts/6 Ss 0:00 | \_ bash
> > 5876 pts/6 S+ 0:00 | | \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
> > 5877 pts/6 Z+ 0:00 | | \_ [trinity] <defunct>
> > 5878 pts/6 S+ 0:01 | | \_ [trinity-main]
> >
> > $ sudo cat /proc/5878/stack
> > [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
> > [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
> > [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
So it seems it affects both SCO and RFCOMM.
> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
I first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6.
I'll look at linux-next tomorrow.
thanks,
Dave
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-12 23:37 ` Marcel Holtmann
0 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-12 23:37 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development
Hi Dave,
>>> Here's the info I found in the logs, it looks like this was the only bluetooth socket.
>>>
>>> fd[195] = domain:31 (PF_BLUETOOTH) type:0x5 protocol:2
>>> Setsockopt(1 d 2134000 8) on fd 195
>>
>> this is a bit confusing. Protocol 2 is actually SCO, but the stack trace shows RFCOMM.
>
> Sorry, mixed up two separate runs. In the log above, the stack trace is actually..
>
> [<ffffffffa0492dca>] bt_sock_wait_state+0xda/0x240 [bluetooth]
> [<ffffffffa04c86d8>] sco_sock_release+0xb8/0xf0 [bluetooth]
> [<ffffffff815cb1ff>] sock_release+0x1f/0x90
> [<ffffffff815cb282>] sock_close+0x12/0x20
>
>
>>> ./trinity -P PF_BLUETOOTH -l off -c setsockopt
>>>
>>> let it run a few seconds, and then ctrl-c. The main process will never exit.
>>>
>>> 5814 pts/6 Ss 0:00 | \_ bash
>>> 5876 pts/6 S+ 0:00 | | \_ ./trinity -P PF_BLUETOOTH -l off -c setsockopt
>>> 5877 pts/6 Z+ 0:00 | | \_ [trinity] <defunct>
>>> 5878 pts/6 S+ 0:01 | | \_ [trinity-main]
>>>
>>> $ sudo cat /proc/5878/stack
>>> [<ffffffffa04397a2>] bt_sock_wait_state+0xc2/0x190 [bluetooth]
>>> [<ffffffffa0847a75>] rfcomm_sock_shutdown+0x85/0xb0 [rfcomm]
>>> [<ffffffffa0847ad9>] rfcomm_sock_release+0x39/0xb0 [rfcomm]
>
> So it seems it affects both SCO and RFCOMM.
>
>> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
>> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
>
> first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
> I'll look at linux-next tomorrow.
I looked through the code, and we only call bt_sock_wait_state() when SOCK_LINGER and sk_lingertime are set. In that case we actually block until the socket state changes to BT_CLOSED.
The only way I can see this happening is if you have a huge linger timeout and confused the socket state beforehand. What is the actual list of system calls that you are throwing at this socket?
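The condition described above can be written down as a small userspace model. This is only a sketch of the rule as stated in this message, not the kernel code: struct example_sock and example_release_waits() are hypothetical names, and the two fields stand in for the SOCK_LINGER flag and sk_lingertime.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of socket state in question. */
struct example_sock {
    bool linger_flag;   /* models the SOCK_LINGER flag */
    long lingertime;    /* models sk_lingertime (a timeout) */
};

/*
 * True when release would go on to wait for the BT_CLOSED state:
 * the linger flag must be set AND the linger time must be positive.
 */
bool example_release_waits(const struct example_sock *sk)
{
    return sk->linger_flag && sk->lingertime > 0;
}
```

Under this model, a socket with the flag cleared never waits, whatever linger time is left behind, and a very large linger time with the flag set means a correspondingly long wait if the state never reaches BT_CLOSED.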
Regards
Marcel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
@ 2013-11-13 0:28 ` Dave Jones
0 siblings, 0 replies; 17+ messages in thread
From: Dave Jones @ 2013-11-13 0:28 UTC (permalink / raw)
To: Marcel Holtmann; +Cc: netdev, linux-bluetooth@vger.kernel.org development
On Wed, Nov 13, 2013 at 08:37:15AM +0900, Marcel Holtmann wrote:
> > So it seems it affects both SCO and RFCOMM.
> >
> >> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
> >> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
> >
> > first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
> > I'll look at linux-next tomorrow.
>
> I looked through the code and only call bt_sock_wait_state when SOCK_LINGER and sk_lingertime is set. In that case we actually block until the socket state changes to BT_CLOSED.
>
> The only way I see this could happen is if you have a huge linger timeout and confused the socket state before. What is actually the list of system calls that you are throwing at this socket.
Ah. I recently changed some code that now does this on every socket at shutdown
(simplified cut-n-paste):
	struct linger ling = { .l_onoff = FALSE, };

	for (i = 0; i < nr_sockets; i++) {
		fd = shm->sockets[i].fd;
		shm->sockets[i].fd = 0;

		setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(struct linger));
		shutdown(fd, SHUT_RDWR);
		close(fd);
	}
I could rip out that linger code completely and just hope that sockets staying in
TIME_WAIT is good enough. IIRC, I added it because, after multiple runs, some of the
weirder protocols would fail to open a socket once a certain number of existing
sockets had been opened, even if they were in SOCK_WAIT.
Two questions remain, though. That code sets linger to false, so why would
that cause sk_lingertime to be taken into consideration? And why is this
(apparently) only a problem for Bluetooth?
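One way to check what the kernel actually recorded is to read the option back with getsockopt() right after switching lingering off. The sketch below uses only the standard sockets API; example_disable_linger() is a hypothetical helper name and is not part of the fuzzer.

```c
#include <sys/socket.h>

/*
 * Switch lingering off on fd, then read SO_LINGER back into *out so the
 * caller can inspect what the kernel reports.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int example_disable_linger(int fd, struct linger *out)
{
    struct linger ling = { .l_onoff = 0, .l_linger = 0 };
    socklen_t len = sizeof(*out);

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling)) < 0)
        return -1;

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, out, &len);
}
```

Calling this just before shutdown() and logging out->l_onoff and out->l_linger would show whether the socket still reports lingering as enabled at that point.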
Dave
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
2013-11-13 0:28 ` Dave Jones
(?)
@ 2013-11-13 1:58 ` Marcel Holtmann
-1 siblings, 0 replies; 17+ messages in thread
From: Marcel Holtmann @ 2013-11-13 1:58 UTC (permalink / raw)
To: Dave Jones; +Cc: netdev, linux-bluetooth@vger.kernel.org development
Hi Dave,
>>> So it seems it affects both SCO and RFCOMM.
>>>
>>>> What kernel did you run this against? It is a shot in the dark, but can you try linux-next quickly.
>>>> There was a socket related fix for the socket options where we confused RFCOMM vs L2CAP struct sock.
>>>
>>> first noticed it on Linus' latest HEAD, and then reproduced it on 3.11.6
>>> I'll look at linux-next tomorrow.
>>
>> I looked through the code and only call bt_sock_wait_state when SOCK_LINGER and sk_lingertime is set. In that case we actually block until the socket state changes to BT_CLOSED.
>>
>> The only way I see this could happen is if you have a huge linger timeout and confused the socket state before. What is actually the list of system calls that you are throwing at this socket.
>
> Ah. I recently changed some code that's now doing this on every socket at shutdown..
> (simplified cut-n-paste)
>
> struct linger ling = { .l_onoff = FALSE, };
>
> for (i = 0; i < nr_sockets; i++) {
> fd = shm->sockets[i].fd;
> shm->sockets[i].fd = 0;
>
> setsockopt(fd, SOL_SOCKET, SO_LINGER, &ling, sizeof(struct linger));
> shutdown(fd, SHUT_RDWR);
> close(fd);
> }
>
> I could just rip out that linger code completely and just hope that sockets staying in
> TIME_WAIT is good enough. iirc, I added it when after multiple runs, some of the
> weirder protocols would fail to open a socket once a certain number of existing
> sockets had opened, even if they were in SOCK_WAIT
>
> two remaining questions though. That code is setting linger to false. Why would
> that cause the sk_lingertime to be taken into consideration ? And why is this
> only a problem for bluetooth (apparently) ?
We are not touching that part of setsockopt(). That is handled by net/core/sock.c; we just check whether the SOCK_LINGER flag is set and whether we have a positive sk_lingertime. So it is a bit suspicious that this is happening, but I don't think it is our mistake.
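For readers following along, here is a simplified userspace model of how a generic SO_LINGER handler could turn a struct linger into a flag plus a timeout. It is an assumption-laden sketch, not a copy of net/core/sock.c: the names example_sk, example_set_linger() and EXAMPLE_HZ are hypothetical, and the tick rate is chosen only for illustration.

```c
#include <stdbool.h>
#include <sys/socket.h>

#define EXAMPLE_HZ 1000UL   /* assumed tick rate, illustration only */

/* Hypothetical model of the per-socket linger state. */
struct example_sk {
    bool linger_flag;           /* models the SOCK_LINGER flag */
    unsigned long lingertime;   /* models sk_lingertime, in ticks */
};

/*
 * Model of an SO_LINGER setter: l_onoff == 0 clears the flag and leaves
 * the stored timeout alone; a nonzero l_onoff stores l_linger (seconds)
 * scaled to ticks and sets the flag.
 */
void example_set_linger(struct example_sk *sk, const struct linger *ling)
{
    if (!ling->l_onoff) {
        sk->linger_flag = false;
    } else {
        sk->lingertime = (unsigned long)(unsigned int)ling->l_linger * EXAMPLE_HZ;
        sk->linger_flag = true;
    }
}
```

In this model, a request with a nonzero l_onoff and an arbitrary l_linger (as a fuzzer might pass) yields a set flag and a potentially huge timeout, while a later request with l_onoff set to zero clears the flag again.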
Regards
Marcel
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: shutdown(3) and bluetooth.
2013-11-12 21:13 ` David Miller
(?)
@ 2013-11-13 14:02 ` John W. Linville
-1 siblings, 0 replies; 17+ messages in thread
From: John W. Linville @ 2013-11-13 14:02 UTC (permalink / raw)
To: David Miller; +Cc: davej, netdev, linux-bluetooth, linux-wireless
On Tue, Nov 12, 2013 at 04:13:50PM -0500, David Miller wrote:
> From: Dave Jones <davej@redhat.com>
> Date: Tue, 12 Nov 2013 16:11:25 -0500
>
> > Is shutdown() allowed to block indefinitely ? The man page doesn't say either way,
> > and I've noticed that my fuzz tester occasionally hangs for days spinning in bt_sock_wait_state()
> >
> > Is there something I should be doing to guarantee that this operation
> > will either time out, or return instantly ?
> >
> > In this specific case, I doubt anything is on the "sender" end of the socket, so
> > it's going to be waiting forever for a state change that won't arrive.
>
> Adding bluetooth and wireless lists. Dave, please consult MAINTAINERS when
> asking questions like this, thanks!
I don't have an authoritative answer. I do, however, seem to recall
that trying to shut down a SunOS box with a hung NFS mount would seem
to hang forever. I don't think that is a great metric for how we
should behave, of course...
John
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 17+ messages in thread