* A concurrency bug between l2tp_tunnel_register() and l2tp_xmit_core()
From: Gong, Sishuai @ 2021-04-13 17:30 UTC
To: jchapman, tparkin; +Cc: netdev
Hi,
We found a concurrency bug in Linux 5.12-rc3 and are able to reproduce it on x86. The bug occurs when two l2tp functions, l2tp_tunnel_register() and l2tp_xmit_core(), run in parallel. l2tp_tunnel_register() adds a tunnel to the tunnel list before the tunnel is fully initialized, and l2tp_xmit_core() then reads tunnel->sock before it has been set. The interleaving is shown below.
------------------------------------------
Execution interleaving
Thread 1                                               Thread 2

l2tp_tunnel_register()
  spin_lock_bh(&pn->l2tp_tunnel_list_lock);
  …
  list_add_rcu(&tunnel->list, &pn->l2tp_tunnel_list);
  // tunnel becomes visible
  spin_unlock_bh(&pn->l2tp_tunnel_list_lock);
                                                       pppol2tp_connect()
                                                         …
                                                         tunnel = l2tp_tunnel_get(sock_net(sk), info.tunnel_id);
                                                         // Successfully get the new tunnel
                                                         …
                                                       l2tp_xmit_core()
                                                         struct sock *sk = tunnel->sock;
                                                         // uninitialized, sk=0
                                                         …
                                                         bh_lock_sock(sk);
                                                         // Null-pointer exception happens
  …
  tunnel->sock = sk;
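
The interleaving above can be replayed deterministically in user space. What follows is only a minimal single-threaded sketch, not kernel code: struct fake_sock, struct fake_tunnel, registry, and every function name are hypothetical stand-ins for the l2tp structures, and registration is split into two calls so that the lookup from Thread 2 can run between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct l2tp_tunnel. */
struct fake_sock { int id; };
struct fake_tunnel { int tunnel_id; struct fake_sock *sock; };

/* Stand-in for pn->l2tp_tunnel_list: a single published slot. */
static struct fake_tunnel *registry;

/* Thread 1, first half: the tunnel becomes visible to lookups. */
static void register_publish(struct fake_tunnel *t)
{
    registry = t;
}

/* Thread 1, second half: the socket is attached only afterwards. */
static void register_attach_sock(struct fake_tunnel *t, struct fake_sock *sk)
{
    t->sock = sk;
}

/* Thread 2: look the tunnel up by id. */
static struct fake_tunnel *tunnel_get(int id)
{
    return (registry != NULL && registry->tunnel_id == id) ? registry : NULL;
}

/* Thread 2: transmit. Returns -1 at the point where the kernel code
 * would dereference a NULL socket pointer, otherwise the socket id. */
static int xmit(const struct fake_tunnel *t)
{
    const struct fake_sock *sk;

    assert(t != NULL);
    sk = t->sock;
    if (sk == NULL)
        return -1;
    return sk->id;
}

/* Replays the reported order: publish, lookup and xmit, then attach. */
static int replay_reported_order(void)
{
    static struct fake_sock sk = { .id = 7 };
    static struct fake_tunnel t = { .tunnel_id = 1, .sock = NULL };
    int seen;

    registry = NULL;
    t.sock = NULL;

    register_publish(&t);            /* Thread 1 */
    seen = xmit(tunnel_get(1));      /* Thread 2 runs in the window */
    register_attach_sock(&t, &sk);   /* Thread 1, too late */
    return seen;
}
```

Calling replay_reported_order() returns -1 because the lookup succeeds while the socket pointer is still NULL. A later xmit() on the same tunnel succeeds, which is why the problem only shows up inside this narrow window.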
------------------------------------------
Impact & fix
This bug causes a kernel NULL pointer dereference, as shown in the console output below. We think a potential fix is to initialize tunnel->sock before adding the tunnel to l2tp_tunnel_list.
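
The ordering suggested here (initialize first, publish last) can be sketched in user space as follows. This is an illustration under assumptions, not a proposed kernel patch: C11 release/acquire atomics stand in for the list_add_rcu() publication, and struct demo_sock, struct demo_tunnel, published, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct l2tp_tunnel. */
struct demo_sock { int id; };
struct demo_tunnel { int tunnel_id; struct demo_sock *sock; };

/* Stand-in for the tunnel list: one atomically published pointer. */
static _Atomic(struct demo_tunnel *) published;

/* Writer: fill in every field first, then publish with release
 * semantics so a reader that sees the pointer also sees the fields. */
static void tunnel_register_fixed(struct demo_tunnel *t, int id,
                                  struct demo_sock *sk)
{
    assert(t != NULL && sk != NULL);
    t->tunnel_id = id;
    t->sock = sk;
    atomic_store_explicit(&published, t, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
static struct demo_sock *tunnel_lookup_sock(int id)
{
    struct demo_tunnel *t =
        atomic_load_explicit(&published, memory_order_acquire);

    if (t == NULL || t->tunnel_id != id)
        return NULL;
    return t->sock;
}

/* Registers one tunnel and reads its socket back through the lookup. */
static int demo_fixed_order(void)
{
    static struct demo_sock sk = { .id = 7 };
    static struct demo_tunnel t;
    struct demo_sock *seen;

    atomic_store_explicit(&published, (struct demo_tunnel *)NULL,
                          memory_order_relaxed);
    tunnel_register_fixed(&t, 1, &sk);
    seen = tunnel_lookup_sock(1);
    return seen != NULL ? seen->id : -1;
}
```

With this ordering there is no point at which a reader can obtain the tunnel and still find a NULL socket pointer, because the pointer is stored before the tunnel is published.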
------------------------------------------
Console output
[ 806.566775][T10805] BUG: kernel NULL pointer dereference, address: 00000070
[ 807.097222][T10805] #PF: supervisor read access in kernel mode
[ 807.647927][T10805] #PF: error_code(0x0000) - not-present page
[ 808.255377][T10805] *pde = 00000000
[ 808.757649][T10805] Oops: 0000 [#1] PREEMPT SMP
[ 809.367746][T10805] CPU: 1 PID: 10805 Comm: executor Not tainted 5.12.0-rc3 #3
[ 810.590670][T10805] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 811.126044][T10805] EIP: _raw_spin_lock+0x16/0x50
[ 811.671747][T10805] Code: 00 00 00 00 55 89 d0 89 e5 e8 26 8c 20 fe 5d c3 8d 74 26 00 55 89 c1 89 e5 53 64 ff 05 0c 97 fb c3 31 d2 bb 01 00 00 00 89 d0 <f0> 0f b1 19 75 0c 8b 5d fc c9 c3 8d b4 26 00 00 00 00 8b 15 e8 7c
[ 813.375919][T10805] EAX: 00000000 EBX: 00000001 ECX: 00000070 EDX: 00000000
[ 813.989487][T10805] ESI: cbb59300 EDI: cbac8c00 EBP: cf54fd68 ESP: cf54fd64
[ 814.629205][T10805] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00000246
[ 815.811079][T10805] CR0: 80050033 CR2: 00000070 CR3: 0efd3000 CR4: 00000690
[ 816.526951][T10805] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 817.158214][T10805] DR6: 00000000 DR7: 00000000
[ 817.762257][T10805] Call Trace:
[ 818.322192][T10805] l2tp_xmit_skb+0x11a/0x530
[ 818.876097][T10805] pppol2tp_sendmsg+0x160/0x290
[ 819.438224][T10805] sock_sendmsg+0x2d/0x40
[ 820.077999][T10805] ____sys_sendmsg+0x1a2/0x1d0
[ 820.694928][T10805] ? import_iovec+0x13/0x20
[ 821.220194][T10805] ___sys_sendmsg+0x98/0xd0
[ 821.927886][T10805] ? file_update_time+0x4b/0x130
[ 822.458245][T10805] ? vfs_write+0x32c/0x3f0
[ 823.002593][T10805] __sys_sendmsg+0x39/0x80
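
A note on reading the fault address: CR2 and ECX both hold 00000070, which is consistent with _raw_spin_lock() being handed the address of a lock field that sits 0x70 bytes into a structure whose base pointer is NULL. The sketch below only illustrates that arithmetic. struct layout_sock is a hypothetical layout chosen to put a lock at offset 0x70; it is not the real struct sock, whose field offsets depend on the kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: padding chosen so the lock lands at 0x70.
 * This is NOT the real struct sock layout. */
struct layout_sock {
    unsigned char other_fields[0x70];
    unsigned int slock;
};

/* Compile-time check that the illustrative layout is as described. */
static_assert(offsetof(struct layout_sock, slock) == 0x70,
              "lock is expected at offset 0x70 in this sketch");

/* Address that a lock operation would touch for a given base pointer
 * value: base plus the offset of the lock field. */
static uintptr_t lock_address(uintptr_t base)
{
    return base + offsetof(struct layout_sock, slock);
}
```

For a base of 0, lock_address() yields 0x70, so a small non-zero fault address like the one in the log usually points to a field access through a NULL structure pointer rather than to a NULL pointer being dereferenced directly.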
Thanks,
Sishuai
* Re: A concurrency bug between l2tp_tunnel_register() and l2tp_xmit_core()
From: Tom Parkin @ 2021-04-14 19:37 UTC
To: Gong, Sishuai; +Cc: jchapman, netdev
Hi Sishuai,
Thanks for the report!
Your analysis looks correct to me, and the suggested fix sounds
reasonable too.
Is this something you plan to submit a patch for?
Best regards,
Tom
* Re: A concurrency bug between l2tp_tunnel_register() and l2tp_xmit_core()
From: Gong, Sishuai @ 2021-04-14 19:53 UTC
To: Tom Parkin; +Cc: jchapman, netdev
On Apr 14, 2021, at 3:37 PM, Tom Parkin <tparkin@katalix.com> wrote:
> Hi Sishuai,
>
> Thanks for the report!
>
> Your analysis looks correct to me, and the suggested fix sounds
> reasonable too.
Thanks, I am glad I could be helpful:)
> Is this something you plan to submit a patch for?
We are not planning to submit a patch for now, because we think experienced developers have a more comprehensive view than we do, but we are very happy to test any potential patches.
>
> Best regards,
> Tom
* Re: A concurrency bug between l2tp_tunnel_register() and l2tp_xmit_core()
From: Cong Wang @ 2021-04-14 20:07 UTC
To: Gong, Sishuai; +Cc: jchapman, tparkin, netdev
On Tue, Apr 13, 2021 at 3:10 PM Gong, Sishuai <sishuai@purdue.edu> wrote:
> Impact & fix
>
> This bug causes a kernel NULL pointer dereference, as shown in the console output below. We think a potential fix is to initialize tunnel->sock before adding the tunnel to l2tp_tunnel_list.
I think this is the right fix. Please submit a patch formally.
Thanks.