netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
@ 2023-01-03 11:12 Ying Hsu
  2023-01-04  6:51 ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Ying Hsu @ 2023-01-03 11:12 UTC (permalink / raw)
  To: linux-bluetooth
  Cc: chromeos-bluetooth-upstreaming, Ying Hsu, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

There's a possible deadlock when two processes are connecting
and closing concurrently:
  + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
  rfcomm_sk_state_change which locks the sock.
  + CPU1: rfcomm_sock_connect locks the socket and then calls
  rfcomm_dlc_open which locks rfcomm.

Here's the call trace:

-> #2 (&d->lock){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:603 [inline]
       __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
       __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
       rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
       __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
       rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
       rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (rfcomm_mutex){+.+.}-{3:3}:
       __mutex_lock_common kernel/locking/mutex.c:603 [inline]
       __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
       rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
       rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
       __sys_connect_file+0x153/0x1a0 net/socket.c:1976
       __sys_connect+0x165/0x1a0 net/socket.c:1993
       __do_sys_connect net/socket.c:2003 [inline]
       __se_sys_connect net/socket.c:2000 [inline]
       __x64_sys_connect+0x73/0xb0 net/socket.c:2000
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3097 [inline]
       check_prevs_add kernel/locking/lockdep.c:3216 [inline]
       validate_chain kernel/locking/lockdep.c:3831 [inline]
       __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
       lock_acquire kernel/locking/lockdep.c:5668 [inline]
       lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
       lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
       lock_sock include/net/sock.h:1725 [inline]
       rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
       __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
       rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
       __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
       rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
       rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
       __sock_release+0xcd/0x280 net/socket.c:650
       sock_close+0x1c/0x20 net/socket.c:1365
       __fput+0x27c/0xa90 fs/file_table.c:320
       task_work_run+0x16f/0x270 kernel/task_work.c:179
       exit_task_work include/linux/task_work.h:38 [inline]
       do_exit+0xaa8/0x2950 kernel/exit.c:867
       do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
       get_signal+0x21c3/0x2450 kernel/signal.c:2859
       arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
       exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
       exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
       do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
---
This commit has been tested with a C reproducer on qemu-x86_64.

 net/bluetooth/rfcomm/sock.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 21e24da4847f..29f9a88a3dc8 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
 	d->sec_level = rfcomm_pi(sk)->sec_level;
 	d->role_switch = rfcomm_pi(sk)->role_switch;
 
+	release_sock(sk);
 	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
 			      sa->rc_channel);
+	lock_sock(sk);
 	if (!err)
 		err = bt_sock_wait_state(sk, BT_CONNECTED,
 				sock_sndtimeo(sk, flags & O_NONBLOCK));
-- 
2.39.0.314.g84b9a713c41-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-03 11:12 [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change Ying Hsu
@ 2023-01-04  6:51 ` Leon Romanovsky
  2023-01-06  0:24   ` Saeed Mahameed
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2023-01-04  6:51 UTC (permalink / raw)
  To: Ying Hsu
  Cc: linux-bluetooth, chromeos-bluetooth-upstreaming, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
> There's a possible deadlock when two processes are connecting
> and closing concurrently:
>   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
>   rfcomm_sk_state_change which locks the sock.
>   + CPU1: rfcomm_sock_connect locks the socket and then calls
>   rfcomm_dlc_open which locks rfcomm.
> 
> Here's the call trace:
> 
> -> #2 (&d->lock){+.+.}-{3:3}:
>        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
>        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
>        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
>        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>        __sock_release+0xcd/0x280 net/socket.c:650
>        sock_close+0x1c/0x20 net/socket.c:1365
>        __fput+0x27c/0xa90 fs/file_table.c:320
>        task_work_run+0x16f/0x270 kernel/task_work.c:179
>        exit_task_work include/linux/task_work.h:38 [inline]
>        do_exit+0xaa8/0x2950 kernel/exit.c:867
>        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> -> #1 (rfcomm_mutex){+.+.}-{3:3}:
>        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
>        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
>        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
>        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
>        __sys_connect+0x165/0x1a0 net/socket.c:1993
>        __do_sys_connect net/socket.c:2003 [inline]
>        __se_sys_connect net/socket.c:2000 [inline]
>        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
>        check_prev_add kernel/locking/lockdep.c:3097 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
>        validate_chain kernel/locking/lockdep.c:3831 [inline]
>        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
>        lock_acquire kernel/locking/lockdep.c:5668 [inline]
>        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
>        lock_sock include/net/sock.h:1725 [inline]
>        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
>        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
>        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
>        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>        __sock_release+0xcd/0x280 net/socket.c:650
>        sock_close+0x1c/0x20 net/socket.c:1365
>        __fput+0x27c/0xa90 fs/file_table.c:320
>        task_work_run+0x16f/0x270 kernel/task_work.c:179
>        exit_task_work include/linux/task_work.h:38 [inline]
>        do_exit+0xaa8/0x2950 kernel/exit.c:867
>        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> ---
> This commit has been tested with a C reproducer on qemu-x86_64.
> 
>  net/bluetooth/rfcomm/sock.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
> index 21e24da4847f..29f9a88a3dc8 100644
> --- a/net/bluetooth/rfcomm/sock.c
> +++ b/net/bluetooth/rfcomm/sock.c
> @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
>  	d->sec_level = rfcomm_pi(sk)->sec_level;
>  	d->role_switch = rfcomm_pi(sk)->role_switch;
>  
> +	release_sock(sk);
>  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
                                           ^^^^
Are you sure that "sk" still exists here after you called to release_sock(sk)?
What prevents from use-after-free here?

>  			      sa->rc_channel);
> +	lock_sock(sk);
>  	if (!err)
>  		err = bt_sock_wait_state(sk, BT_CONNECTED,
>  				sock_sndtimeo(sk, flags & O_NONBLOCK));
> -- 
> 2.39.0.314.g84b9a713c41-goog
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-04  6:51 ` Leon Romanovsky
@ 2023-01-06  0:24   ` Saeed Mahameed
  2023-01-08 10:12     ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2023-01-06  0:24 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Ying Hsu, linux-bluetooth, chromeos-bluetooth-upstreaming,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On 04 Jan 08:51, Leon Romanovsky wrote:
>On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
>> There's a possible deadlock when two processes are connecting
>> and closing concurrently:
>>   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
>>   rfcomm_sk_state_change which locks the sock.
>>   + CPU1: rfcomm_sock_connect locks the socket and then calls
>>   rfcomm_dlc_open which locks rfcomm.
>>
>> Here's the call trace:
>>
>> -> #2 (&d->lock){+.+.}-{3:3}:
>>        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>>        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
>>        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
>>        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
>>        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>>        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>>        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>>        __sock_release+0xcd/0x280 net/socket.c:650
>>        sock_close+0x1c/0x20 net/socket.c:1365
>>        __fput+0x27c/0xa90 fs/file_table.c:320
>>        task_work_run+0x16f/0x270 kernel/task_work.c:179
>>        exit_task_work include/linux/task_work.h:38 [inline]
>>        do_exit+0xaa8/0x2950 kernel/exit.c:867
>>        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>>        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>>        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>>        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>>        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>>        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>>        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> -> #1 (rfcomm_mutex){+.+.}-{3:3}:
>>        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>>        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
>>        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
>>        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
>>        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
>>        __sys_connect+0x165/0x1a0 net/socket.c:1993
>>        __do_sys_connect net/socket.c:2003 [inline]
>>        __se_sys_connect net/socket.c:2000 [inline]
>>        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
>>        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>>        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
>>        check_prev_add kernel/locking/lockdep.c:3097 [inline]
>>        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
>>        validate_chain kernel/locking/lockdep.c:3831 [inline]
>>        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
>>        lock_acquire kernel/locking/lockdep.c:5668 [inline]
>>        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>>        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
>>        lock_sock include/net/sock.h:1725 [inline]
>>        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
>>        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
>>        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
>>        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>>        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>>        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>>        __sock_release+0xcd/0x280 net/socket.c:650
>>        sock_close+0x1c/0x20 net/socket.c:1365
>>        __fput+0x27c/0xa90 fs/file_table.c:320
>>        task_work_run+0x16f/0x270 kernel/task_work.c:179
>>        exit_task_work include/linux/task_work.h:38 [inline]
>>        do_exit+0xaa8/0x2950 kernel/exit.c:867
>>        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>>        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>>        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>>        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>>        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>>        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>>        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> Signed-off-by: Ying Hsu <yinghsu@chromium.org>
>> ---
>> This commit has been tested with a C reproducer on qemu-x86_64.
>>
>>  net/bluetooth/rfcomm/sock.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
>> index 21e24da4847f..29f9a88a3dc8 100644
>> --- a/net/bluetooth/rfcomm/sock.c
>> +++ b/net/bluetooth/rfcomm/sock.c
>> @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
>>  	d->sec_level = rfcomm_pi(sk)->sec_level;
>>  	d->role_switch = rfcomm_pi(sk)->role_switch;
>>
>> +	release_sock(sk);
>>  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
>                                           ^^^^
>Are you sure that "sk" still exists here after you called to release_sock(sk)?
>What prevents from use-after-free here?
>

sk must be valid to be locked in first place.
release_sock() has mutex unlock semantics so it doesn't free anything..

from a quick review it doesn't seem to be necessary to keep the lock held
around rfcomm_dlc_open, and looking into bt_sock_wait_state, it also
releases the lock temporarily internally ..  so it won't matter if the lock
got released temporarily earlier, IMHO. 
Though you might want to make sure rfcomm_pi(sk)->src won't change,
it seems like other functions use &rfcomm_sk_list.lock to protect it.

Reviewed-by: Saeed Mahameed <saeed@kernel.org>

>>  			      sa->rc_channel);
>> +	lock_sock(sk);
>>  	if (!err)
>>  		err = bt_sock_wait_state(sk, BT_CONNECTED,
>>  				sock_sndtimeo(sk, flags & O_NONBLOCK));
>> --
>> 2.39.0.314.g84b9a713c41-goog
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-06  0:24   ` Saeed Mahameed
@ 2023-01-08 10:12     ` Leon Romanovsky
  2023-01-10  9:07       ` Saeed Mahameed
  0 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2023-01-08 10:12 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Ying Hsu, linux-bluetooth, chromeos-bluetooth-upstreaming,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
> On 04 Jan 08:51, Leon Romanovsky wrote:
> > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
> > > There's a possible deadlock when two processes are connecting
> > > and closing concurrently:
> > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
> > >   rfcomm_sk_state_change which locks the sock.
> > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
> > >   rfcomm_dlc_open which locks rfcomm.
> > > 
> > > Here's the call trace:
> > > 
> > > -> #2 (&d->lock){+.+.}-{3:3}:
> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
> > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
> > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> > >        __sock_release+0xcd/0x280 net/socket.c:650
> > >        sock_close+0x1c/0x20 net/socket.c:1365
> > >        __fput+0x27c/0xa90 fs/file_table.c:320
> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> > >        exit_task_work include/linux/task_work.h:38 [inline]
> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > 
> > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
> > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
> > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
> > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
> > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
> > >        __do_sys_connect net/socket.c:2003 [inline]
> > >        __se_sys_connect net/socket.c:2000 [inline]
> > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
> > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > 
> > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
> > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
> > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
> > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
> > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
> > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
> > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
> > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
> > >        lock_sock include/net/sock.h:1725 [inline]
> > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
> > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
> > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> > >        __sock_release+0xcd/0x280 net/socket.c:650
> > >        sock_close+0x1c/0x20 net/socket.c:1365
> > >        __fput+0x27c/0xa90 fs/file_table.c:320
> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> > >        exit_task_work include/linux/task_work.h:38 [inline]
> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > 
> > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> > > ---
> > > This commit has been tested with a C reproducer on qemu-x86_64.
> > > 
> > >  net/bluetooth/rfcomm/sock.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
> > > index 21e24da4847f..29f9a88a3dc8 100644
> > > --- a/net/bluetooth/rfcomm/sock.c
> > > +++ b/net/bluetooth/rfcomm/sock.c
> > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
> > >  	d->sec_level = rfcomm_pi(sk)->sec_level;
> > >  	d->role_switch = rfcomm_pi(sk)->role_switch;
> > > 
> > > +	release_sock(sk);
> > >  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
> >                                           ^^^^
> > Are you sure that "sk" still exists here after you called to release_sock(sk)?
> > What prevents from use-after-free here?
> > 
> 
> sk must be valid to be locked in first place.

It is, but after it is released it won't.

> release_sock() has mutex unlock semantics so it doesn't free anything..

What do you mean?

I see a lot of magic release calls.

  3481 void release_sock(struct sock *sk)
  3482 {
  3483         spin_lock_bh(&sk->sk_lock.slock);
  3484         if (sk->sk_backlog.tail)
  3485                 __release_sock(sk);
  3486
  3487         /* Warning : release_cb() might need to release sk ownership,
  3488          * ie call sock_release_ownership(sk) before us.
  3489          */
  3490         if (sk->sk_prot->release_cb)
  3491                 sk->sk_prot->release_cb(sk);
  3492
  3493         sock_release_ownership(sk);
  3494         if (waitqueue_active(&sk->sk_lock.wq))
  3495                 wake_up(&sk->sk_lock.wq);
  3496         spin_unlock_bh(&sk->sk_lock.slock);
  3497 }
  3498 EXPORT_SYMBOL(release_sock);

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-08 10:12     ` Leon Romanovsky
@ 2023-01-10  9:07       ` Saeed Mahameed
  2023-01-10 12:22         ` Leon Romanovsky
  2023-01-11  0:34         ` Luiz Augusto von Dentz
  0 siblings, 2 replies; 9+ messages in thread
From: Saeed Mahameed @ 2023-01-10  9:07 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Ying Hsu, linux-bluetooth, chromeos-bluetooth-upstreaming,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On 08 Jan 12:12, Leon Romanovsky wrote:
>On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
>> On 04 Jan 08:51, Leon Romanovsky wrote:
>> > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
>> > > There's a possible deadlock when two processes are connecting
>> > > and closing concurrently:
>> > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
>> > >   rfcomm_sk_state_change which locks the sock.
>> > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
>> > >   rfcomm_dlc_open which locks rfcomm.
>> > >
>> > > Here's the call trace:
>> > >
>> > > -> #2 (&d->lock){+.+.}-{3:3}:
>> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
>> > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
>> > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> > >        __sock_release+0xcd/0x280 net/socket.c:650
>> > >        sock_close+0x1c/0x20 net/socket.c:1365
>> > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> > >        exit_task_work include/linux/task_work.h:38 [inline]
>> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > >
>> > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
>> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
>> > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
>> > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
>> > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
>> > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
>> > >        __do_sys_connect net/socket.c:2003 [inline]
>> > >        __se_sys_connect net/socket.c:2000 [inline]
>> > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
>> > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>> > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > >
>> > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
>> > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
>> > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
>> > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
>> > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
>> > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
>> > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>> > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
>> > >        lock_sock include/net/sock.h:1725 [inline]
>> > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
>> > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
>> > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> > >        __sock_release+0xcd/0x280 net/socket.c:650
>> > >        sock_close+0x1c/0x20 net/socket.c:1365
>> > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> > >        exit_task_work include/linux/task_work.h:38 [inline]
>> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > >
>> > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
>> > > ---
>> > > This commit has been tested with a C reproducer on qemu-x86_64.
>> > >
>> > >  net/bluetooth/rfcomm/sock.c | 2 ++
>> > >  1 file changed, 2 insertions(+)
>> > >
>> > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
>> > > index 21e24da4847f..29f9a88a3dc8 100644
>> > > --- a/net/bluetooth/rfcomm/sock.c
>> > > +++ b/net/bluetooth/rfcomm/sock.c
>> > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
>> > >  	d->sec_level = rfcomm_pi(sk)->sec_level;
>> > >  	d->role_switch = rfcomm_pi(sk)->role_switch;
>> > >
>> > > +	release_sock(sk);
>> > >  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
>> >                                           ^^^^
>> > Are you sure that "sk" still exists here after you called to release_sock(sk)?
>> > What prevents from use-after-free here?
>> >
>>
>> sk must be valid to be locked in first place.
>
>It is, but after it is released it won't.
>

the code is symmetric: 
you hold the sk lock then do your thing and then release it.

if you claim that sk can be freed by another process after you released it,
then due to symmetry it also can be freed before you locked it, unless 

>> release_sock() has mutex unlock semantics so it doesn't free anything..
>
>What do you mean?
>
>I see a lot of magic release calls.
>
>  3481 void release_sock(struct sock *sk)
>  3482 {
>  3483         spin_lock_bh(&sk->sk_lock.slock);
>  3484         if (sk->sk_backlog.tail)
>  3485                 __release_sock(sk);
>  3486
>  3487         /* Warning : release_cb() might need to release sk ownership,
>  3488          * ie call sock_release_ownership(sk) before us.
>  3489          */
>  3490         if (sk->sk_prot->release_cb)
>  3491                 sk->sk_prot->release_cb(sk);
>  3492
>  3493         sock_release_ownership(sk);
>  3494         if (waitqueue_active(&sk->sk_lock.wq))
>  3495                 wake_up(&sk->sk_lock.wq);
          sk must b  still valid here :), so there was no free to sk object
>  3496         spin_unlock_bh(&sk->sk_lock.slock);
>  3497 }
>  3498 EXPORT_SYMBOL(release_sock);
>
>Thanks


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-10  9:07       ` Saeed Mahameed
@ 2023-01-10 12:22         ` Leon Romanovsky
  2023-01-11  1:03           ` Saeed Mahameed
  2023-01-11  0:34         ` Luiz Augusto von Dentz
  1 sibling, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2023-01-10 12:22 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Ying Hsu, linux-bluetooth, chromeos-bluetooth-upstreaming,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On Tue, Jan 10, 2023 at 01:07:45AM -0800, Saeed Mahameed wrote:
> On 08 Jan 12:12, Leon Romanovsky wrote:
> > On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
> > > On 04 Jan 08:51, Leon Romanovsky wrote:
> > > > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
> > > > > There's a possible deadlock when two processes are connecting
> > > > > and closing concurrently:
> > > > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
> > > > >   rfcomm_sk_state_change which locks the sock.
> > > > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
> > > > >   rfcomm_dlc_open which locks rfcomm.
> > > > >
> > > > > Here's the call trace:
> > > > >
> > > > > -> #2 (&d->lock){+.+.}-{3:3}:
> > > > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> > > > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
> > > > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
> > > > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
> > > > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> > > > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> > > > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> > > > >        __sock_release+0xcd/0x280 net/socket.c:650
> > > > >        sock_close+0x1c/0x20 net/socket.c:1365
> > > > >        __fput+0x27c/0xa90 fs/file_table.c:320
> > > > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> > > > >        exit_task_work include/linux/task_work.h:38 [inline]
> > > > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> > > > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> > > > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> > > > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> > > > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> > > > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> > > > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> > > > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> > > > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > >
> > > > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
> > > > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> > > > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
> > > > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
> > > > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
> > > > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
> > > > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
> > > > >        __do_sys_connect net/socket.c:2003 [inline]
> > > > >        __se_sys_connect net/socket.c:2000 [inline]
> > > > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
> > > > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > >
> > > > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
> > > > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
> > > > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
> > > > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
> > > > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
> > > > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
> > > > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
> > > > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
> > > > >        lock_sock include/net/sock.h:1725 [inline]
> > > > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
> > > > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
> > > > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
> > > > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> > > > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> > > > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> > > > >        __sock_release+0xcd/0x280 net/socket.c:650
> > > > >        sock_close+0x1c/0x20 net/socket.c:1365
> > > > >        __fput+0x27c/0xa90 fs/file_table.c:320
> > > > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> > > > >        exit_task_work include/linux/task_work.h:38 [inline]
> > > > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> > > > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> > > > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> > > > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> > > > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> > > > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> > > > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> > > > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> > > > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > >
> > > > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> > > > > ---
> > > > > This commit has been tested with a C reproducer on qemu-x86_64.
> > > > >
> > > > >  net/bluetooth/rfcomm/sock.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
> > > > > index 21e24da4847f..29f9a88a3dc8 100644
> > > > > --- a/net/bluetooth/rfcomm/sock.c
> > > > > +++ b/net/bluetooth/rfcomm/sock.c
> > > > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
> > > > >  	d->sec_level = rfcomm_pi(sk)->sec_level;
> > > > >  	d->role_switch = rfcomm_pi(sk)->role_switch;
> > > > >
> > > > > +	release_sock(sk);
> > > > >  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
> > > >                                           ^^^^
> > > > Are you sure that "sk" still exists here after you called to release_sock(sk)?
> > > > What prevents from use-after-free here?
> > > >
> > > 
> > > sk must be valid to be locked in first place.
> > 
> > It is, but after it is released it won't.
> > 
> 
> the code is symmetric: you hold the sk lock then do your thing and then
> release it.
> 
> if you claim that sk can be freed by another process after you released it,
> then due to symmetry it also can be freed before you locked it, unless

So we can extend your logic and say what the lock_sock() in the beginning of
rfcomm_sock_connect() is not needed too.

Thanks

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-10  9:07       ` Saeed Mahameed
  2023-01-10 12:22         ` Leon Romanovsky
@ 2023-01-11  0:34         ` Luiz Augusto von Dentz
  2023-01-11  0:50           ` Saeed Mahameed
  1 sibling, 1 reply; 9+ messages in thread
From: Luiz Augusto von Dentz @ 2023-01-11  0:34 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Leon Romanovsky, Ying Hsu, linux-bluetooth,
	chromeos-bluetooth-upstreaming, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Johan Hedberg, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

Hi Saeed,

On Tue, Jan 10, 2023 at 1:07 AM Saeed Mahameed <saeed@kernel.org> wrote:
>
> On 08 Jan 12:12, Leon Romanovsky wrote:
> >On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
> >> On 04 Jan 08:51, Leon Romanovsky wrote:
> >> > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
> >> > > There's a possible deadlock when two processes are connecting
> >> > > and closing concurrently:
> >> > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
> >> > >   rfcomm_sk_state_change which locks the sock.
> >> > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
> >> > >   rfcomm_dlc_open which locks rfcomm.
> >> > >
> >> > > Here's the call trace:
> >> > >
> >> > > -> #2 (&d->lock){+.+.}-{3:3}:
> >> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> >> > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
> >> > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
> >> > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
> >> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> >> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> >> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> >> > >        __sock_release+0xcd/0x280 net/socket.c:650
> >> > >        sock_close+0x1c/0x20 net/socket.c:1365
> >> > >        __fput+0x27c/0xa90 fs/file_table.c:320
> >> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> >> > >        exit_task_work include/linux/task_work.h:38 [inline]
> >> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> >> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> >> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> >> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> >> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> >> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> >> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> >> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> >> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >> > >
> >> > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
> >> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
> >> > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
> >> > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
> >> > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
> >> > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
> >> > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
> >> > >        __do_sys_connect net/socket.c:2003 [inline]
> >> > >        __se_sys_connect net/socket.c:2000 [inline]
> >> > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
> >> > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >> > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >> > >
> >> > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
> >> > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
> >> > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
> >> > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
> >> > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
> >> > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
> >> > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
> >> > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
> >> > >        lock_sock include/net/sock.h:1725 [inline]
> >> > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
> >> > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
> >> > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
> >> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
> >> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
> >> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
> >> > >        __sock_release+0xcd/0x280 net/socket.c:650
> >> > >        sock_close+0x1c/0x20 net/socket.c:1365
> >> > >        __fput+0x27c/0xa90 fs/file_table.c:320
> >> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
> >> > >        exit_task_work include/linux/task_work.h:38 [inline]
> >> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
> >> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
> >> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
> >> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
> >> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> >> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> >> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> >> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> >> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >> > >
> >> > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> >> > > ---
> >> > > This commit has been tested with a C reproducer on qemu-x86_64.
> >> > >
> >> > >  net/bluetooth/rfcomm/sock.c | 2 ++
> >> > >  1 file changed, 2 insertions(+)
> >> > >
> >> > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
> >> > > index 21e24da4847f..29f9a88a3dc8 100644
> >> > > --- a/net/bluetooth/rfcomm/sock.c
> >> > > +++ b/net/bluetooth/rfcomm/sock.c
> >> > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
> >> > >          d->sec_level = rfcomm_pi(sk)->sec_level;
> >> > >          d->role_switch = rfcomm_pi(sk)->role_switch;
> >> > >
> >> > > +        release_sock(sk);
> >> > >          err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
> >> >                                           ^^^^
> >> > Are you sure that "sk" still exists here after you called to release_sock(sk)?
> >> > What prevents from use-after-free here?
> >> >
> >>
> >> sk must be valid to be locked in first place.
> >
> >It is, but after it is released it won't.
> >
>
> the code is symmetric:
> you hold the sk lock then do your thing and then release it.
>
> if you claim that sk can be freed by another process after you released it,
> then due to symmetry it also can be freed before you locked it, unless

Apart to cases where the socket fd is passed over to other process the
sk is not shared between process, what most likely we want the
sock_lock is for preventing different threads to perform different
operation simultaneously which most likely would cause unexpected
results (e.g. thread 1 calls connect, thread 2 calls close), which is
why most, if not all, proto_ops implementation do call sock_lock as
soon as possible.

> >> release_sock() has mutex unlock semantics so it doesn't free anything..
> >
> >What do you mean?
> >
> >I see a lot of magic release calls.
> >
> >  3481 void release_sock(struct sock *sk)
> >  3482 {
> >  3483         spin_lock_bh(&sk->sk_lock.slock);
> >  3484         if (sk->sk_backlog.tail)
> >  3485                 __release_sock(sk);
> >  3486
> >  3487         /* Warning : release_cb() might need to release sk ownership,
> >  3488          * ie call sock_release_ownership(sk) before us.
> >  3489          */
> >  3490         if (sk->sk_prot->release_cb)
> >  3491                 sk->sk_prot->release_cb(sk);
> >  3492
> >  3493         sock_release_ownership(sk);
> >  3494         if (waitqueue_active(&sk->sk_lock.wq))
> >  3495                 wake_up(&sk->sk_lock.wq);
>           sk must b  still valid here :), so there was no free to sk object

The sk may be valid but it could wakeup another thread pending on
sock_lock which would then do shutdown/close so the next time this
thread attempts to perform sock_lock the state could have been
changed.

> >  3496         spin_unlock_bh(&sk->sk_lock.slock);
> >  3497 }
> >  3498 EXPORT_SYMBOL(release_sock);
> >
> >Thanks
>


-- 
Luiz Augusto von Dentz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-11  0:34         ` Luiz Augusto von Dentz
@ 2023-01-11  0:50           ` Saeed Mahameed
  0 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2023-01-11  0:50 UTC (permalink / raw)
  To: Luiz Augusto von Dentz
  Cc: Leon Romanovsky, Ying Hsu, linux-bluetooth,
	chromeos-bluetooth-upstreaming, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Johan Hedberg, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On 10 Jan 16:34, Luiz Augusto von Dentz wrote:
>Hi Saeed,
>
>On Tue, Jan 10, 2023 at 1:07 AM Saeed Mahameed <saeed@kernel.org> wrote:
>>
>> On 08 Jan 12:12, Leon Romanovsky wrote:
>> >On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
>> >> On 04 Jan 08:51, Leon Romanovsky wrote:
>> >> > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
>> >> > > There's a possible deadlock when two processes are connecting
>> >> > > and closing concurrently:
>> >> > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
>> >> > >   rfcomm_sk_state_change which locks the sock.
>> >> > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
>> >> > >   rfcomm_dlc_open which locks rfcomm.
>> >> > >
>> >> > > Here's the call trace:
>> >> > >
>> >> > > -> #2 (&d->lock){+.+.}-{3:3}:
>> >> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> >> > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
>> >> > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
>> >> > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> >> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> >> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> >> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> >> > >        __sock_release+0xcd/0x280 net/socket.c:650
>> >> > >        sock_close+0x1c/0x20 net/socket.c:1365
>> >> > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> >> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> >> > >        exit_task_work include/linux/task_work.h:38 [inline]
>> >> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> >> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> >> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> >> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> >> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> >> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> >> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> >> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> >> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> >> > >
>> >> > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
>> >> > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> >> > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
>> >> > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
>> >> > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
>> >> > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
>> >> > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
>> >> > >        __do_sys_connect net/socket.c:2003 [inline]
>> >> > >        __se_sys_connect net/socket.c:2000 [inline]
>> >> > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
>> >> > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>> >> > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> >> > >
>> >> > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
>> >> > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
>> >> > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
>> >> > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
>> >> > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
>> >> > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
>> >> > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>> >> > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
>> >> > >        lock_sock include/net/sock.h:1725 [inline]
>> >> > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
>> >> > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
>> >> > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> >> > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> >> > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> >> > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> >> > >        __sock_release+0xcd/0x280 net/socket.c:650
>> >> > >        sock_close+0x1c/0x20 net/socket.c:1365
>> >> > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> >> > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> >> > >        exit_task_work include/linux/task_work.h:38 [inline]
>> >> > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> >> > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> >> > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> >> > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> >> > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> >> > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> >> > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> >> > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> >> > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> >> > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> >> > >
>> >> > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
>> >> > > ---
>> >> > > This commit has been tested with a C reproducer on qemu-x86_64.
>> >> > >
>> >> > >  net/bluetooth/rfcomm/sock.c | 2 ++
>> >> > >  1 file changed, 2 insertions(+)
>> >> > >
>> >> > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
>> >> > > index 21e24da4847f..29f9a88a3dc8 100644
>> >> > > --- a/net/bluetooth/rfcomm/sock.c
>> >> > > +++ b/net/bluetooth/rfcomm/sock.c
>> >> > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
>> >> > >          d->sec_level = rfcomm_pi(sk)->sec_level;
>> >> > >          d->role_switch = rfcomm_pi(sk)->role_switch;
>> >> > >
>> >> > > +        release_sock(sk);
>> >> > >          err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
>> >> >                                           ^^^^
>> >> > Are you sure that "sk" still exists here after you called to release_sock(sk)?
>> >> > What prevents from use-after-free here?
>> >> >
>> >>
>> >> sk must be valid to be locked in first place.
>> >
>> >It is, but after it is released it won't.
>> >
>>
>> the code is symmetric:
>> you hold the sk lock then do your thing and then release it.
>>
>> if you claim that sk can be freed by another process after you released it,
>> then due to symmetry it also can be freed before you locked it, unless
>
>Apart to cases where the socket fd is passed over to other process the
>sk is not shared between process, what most likely we want the
>sock_lock is for preventing different threads to perform different
>operation simultaneously which most likely would cause unexpected
>results (e.g. thread 1 calls connect, thread 2 calls close), which is
>why most, if not all, proto_ops implementation do call sock_lock as
>soon as possible.
>

yes, this what i am trying to explain that the sock_lock is used for
serializing the socket state and not its life cycle.

>> >> release_sock() has mutex unlock semantics so it doesn't free anything..
>> >
>> >What do you mean?
>> >
>> >I see a lot of magic release calls.
>> >
>> >  3481 void release_sock(struct sock *sk)
>> >  3482 {
>> >  3483         spin_lock_bh(&sk->sk_lock.slock);
>> >  3484         if (sk->sk_backlog.tail)
>> >  3485                 __release_sock(sk);
>> >  3486
>> >  3487         /* Warning : release_cb() might need to release sk ownership,
>> >  3488          * ie call sock_release_ownership(sk) before us.
>> >  3489          */
>> >  3490         if (sk->sk_prot->release_cb)
>> >  3491                 sk->sk_prot->release_cb(sk);
>> >  3492
>> >  3493         sock_release_ownership(sk);
>> >  3494         if (waitqueue_active(&sk->sk_lock.wq))
>> >  3495                 wake_up(&sk->sk_lock.wq);
>>           sk must b  still valid here :), so there was no free to sk object
>
>The sk may be valid but it could wakeup another thread pending on
>sock_lock which would then do shutdown/close so the next time this
>thread attempts to perform sock_lock the state could have been
>changed.
>

this is my point, but instead of the pending thread waking up after you
released the sock for the first time, imagine if it managed to grab
the lock and do the shutdown one tick before you attempt the first
sock_lock() in rfcomm_sock_connect()


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change
  2023-01-10 12:22         ` Leon Romanovsky
@ 2023-01-11  1:03           ` Saeed Mahameed
  0 siblings, 0 replies; 9+ messages in thread
From: Saeed Mahameed @ 2023-01-11  1:03 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Ying Hsu, linux-bluetooth, chromeos-bluetooth-upstreaming,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Johan Hedberg,
	Luiz Augusto von Dentz, Marcel Holtmann, Paolo Abeni,
	linux-kernel, netdev

On 10 Jan 14:22, Leon Romanovsky wrote:
>On Tue, Jan 10, 2023 at 01:07:45AM -0800, Saeed Mahameed wrote:
>> On 08 Jan 12:12, Leon Romanovsky wrote:
>> > On Thu, Jan 05, 2023 at 04:24:10PM -0800, Saeed Mahameed wrote:
>> > > On 04 Jan 08:51, Leon Romanovsky wrote:
>> > > > On Tue, Jan 03, 2023 at 11:12:46AM +0000, Ying Hsu wrote:
>> > > > > There's a possible deadlock when two processes are connecting
>> > > > > and closing concurrently:
>> > > > >   + CPU0: __rfcomm_dlc_close locks rfcomm and then calls
>> > > > >   rfcomm_sk_state_change which locks the sock.
>> > > > >   + CPU1: rfcomm_sock_connect locks the socket and then calls
>> > > > >   rfcomm_dlc_open which locks rfcomm.
>> > > > >
>> > > > > Here's the call trace:
>> > > > >
>> > > > > -> #2 (&d->lock){+.+.}-{3:3}:
>> > > > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> > > > >        __mutex_lock0x12f/0x1360 kernel/locking/mutex.c:747
>> > > > >        __rfcomm_dlc_close+0x15d/0x890 net/bluetooth/rfcomm/core.c:487
>> > > > >        rfcomm_dlc_close+1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> > > > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> > > > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> > > > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> > > > >        __sock_release+0xcd/0x280 net/socket.c:650
>> > > > >        sock_close+0x1c/0x20 net/socket.c:1365
>> > > > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> > > > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> > > > >        exit_task_work include/linux/task_work.h:38 [inline]
>> > > > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> > > > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> > > > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> > > > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> > > > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> > > > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> > > > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> > > > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> > > > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > > > >
>> > > > > -> #1 (rfcomm_mutex){+.+.}-{3:3}:
>> > > > >        __mutex_lock_common kernel/locking/mutex.c:603 [inline]
>> > > > >        __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747
>> > > > >        rfcomm_dlc_open+0x93/0xa80 net/bluetooth/rfcomm/core.c:425
>> > > > >        rfcomm_sock_connect+0x329/0x450 net/bluetooth/rfcomm/sock.c:413
>> > > > >        __sys_connect_file+0x153/0x1a0 net/socket.c:1976
>> > > > >        __sys_connect+0x165/0x1a0 net/socket.c:1993
>> > > > >        __do_sys_connect net/socket.c:2003 [inline]
>> > > > >        __se_sys_connect net/socket.c:2000 [inline]
>> > > > >        __x64_sys_connect+0x73/0xb0 net/socket.c:2000
>> > > > >        do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>> > > > >        do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > > > >
>> > > > > -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}:
>> > > > >        check_prev_add kernel/locking/lockdep.c:3097 [inline]
>> > > > >        check_prevs_add kernel/locking/lockdep.c:3216 [inline]
>> > > > >        validate_chain kernel/locking/lockdep.c:3831 [inline]
>> > > > >        __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055
>> > > > >        lock_acquire kernel/locking/lockdep.c:5668 [inline]
>> > > > >        lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633
>> > > > >        lock_sock_nested+0x3a/0xf0 net/core/sock.c:3470
>> > > > >        lock_sock include/net/sock.h:1725 [inline]
>> > > > >        rfcomm_sk_state_change+0x6d/0x3a0 net/bluetooth/rfcomm/sock.c:73
>> > > > >        __rfcomm_dlc_close+0x1b1/0x890 net/bluetooth/rfcomm/core.c:489
>> > > > >        rfcomm_dlc_close+0x1e9/0x240 net/bluetooth/rfcomm/core.c:520
>> > > > >        __rfcomm_sock_close+0x13c/0x250 net/bluetooth/rfcomm/sock.c:220
>> > > > >        rfcomm_sock_shutdown+0xd8/0x230 net/bluetooth/rfcomm/sock.c:907
>> > > > >        rfcomm_sock_release+0x68/0x140 net/bluetooth/rfcomm/sock.c:928
>> > > > >        __sock_release+0xcd/0x280 net/socket.c:650
>> > > > >        sock_close+0x1c/0x20 net/socket.c:1365
>> > > > >        __fput+0x27c/0xa90 fs/file_table.c:320
>> > > > >        task_work_run+0x16f/0x270 kernel/task_work.c:179
>> > > > >        exit_task_work include/linux/task_work.h:38 [inline]
>> > > > >        do_exit+0xaa8/0x2950 kernel/exit.c:867
>> > > > >        do_group_exit+0xd4/0x2a0 kernel/exit.c:1012
>> > > > >        get_signal+0x21c3/0x2450 kernel/signal.c:2859
>> > > > >        arch_do_signal_or_restart+0x79/0x5c0 arch/x86/kernel/signal.c:306
>> > > > >        exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>> > > > >        exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>> > > > >        __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>> > > > >        syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>> > > > >        do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>> > > > >        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>> > > > >
>> > > > > Signed-off-by: Ying Hsu <yinghsu@chromium.org>
>> > > > > ---
>> > > > > This commit has been tested with a C reproducer on qemu-x86_64.
>> > > > >
>> > > > >  net/bluetooth/rfcomm/sock.c | 2 ++
>> > > > >  1 file changed, 2 insertions(+)
>> > > > >
>> > > > > diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
>> > > > > index 21e24da4847f..29f9a88a3dc8 100644
>> > > > > --- a/net/bluetooth/rfcomm/sock.c
>> > > > > +++ b/net/bluetooth/rfcomm/sock.c
>> > > > > @@ -410,8 +410,10 @@ static int rfcomm_sock_connect(struct socket *sock, struct sockaddr *addr, int a
>> > > > >  	d->sec_level = rfcomm_pi(sk)->sec_level;
>> > > > >  	d->role_switch = rfcomm_pi(sk)->role_switch;
>> > > > >
>> > > > > +	release_sock(sk);
>> > > > >  	err = rfcomm_dlc_open(d, &rfcomm_pi(sk)->src, &sa->rc_bdaddr,
>> > > >                                           ^^^^
>> > > > Are you sure that "sk" still exists here after you called to release_sock(sk)?
>> > > > What prevents from use-after-free here?
>> > > >
>> > >
>> > > sk must be valid to be locked in first place.
>> >
>> > It is, but after it is released it won't.
>> >
>>
>> the code is symmetric: you hold the sk lock then do your thing and then
>> release it.
>>
>> if you claim that sk can be freed by another process after you released it,
>> then due to symmetry it also can be freed before you locked it, unless
>
>So we can extend your logic and say what the lock_sock() in the beginning of
>rfcomm_sock_connect() is not needed too.
>

No, it is needed to synchronize the sk state as Luiz explained. 

consider the following:
<--- point A
lock(object);
do_something(object);
unlock(object);
<--- point B

if object can be freed at point B then it's also subject to being freed
at point A, unless do_something(); releases an extra reference.. 
which i am sure is not the case here.

So a higher level synchronization (ref counting) is required to manage the
sk life cycle .. e.g sock_hold/sock_put, but please read the comment on
sock_hold, there are some specific conditions to guarantee sk validity when
using sock_hold()

  


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-01-11  1:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-03 11:12 [PATCH] Bluetooth: Fix possible deadlock in rfcomm_sk_state_change Ying Hsu
2023-01-04  6:51 ` Leon Romanovsky
2023-01-06  0:24   ` Saeed Mahameed
2023-01-08 10:12     ` Leon Romanovsky
2023-01-10  9:07       ` Saeed Mahameed
2023-01-10 12:22         ` Leon Romanovsky
2023-01-11  1:03           ` Saeed Mahameed
2023-01-11  0:34         ` Luiz Augusto von Dentz
2023-01-11  0:50           ` Saeed Mahameed

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).