linux-sctp.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Race of sctp_assoc_control_transport() against sctp_assoc_rm_peer() ?
@ 2020-12-08 20:59 Alexander Sverdlin
  2021-05-12 16:08 ` Xin Long
  0 siblings, 1 reply; 2+ messages in thread
From: Alexander Sverdlin @ 2020-12-08 20:59 UTC (permalink / raw)
  To: linux-sctp
  Cc: Vlad Yasevich, Neil Horman, Marcelo Ricardo Leitner, Petr Malat

Hello!

I'm trying to understand the crash we experience with Linux v4.19
(sorry for the ancient codebase, but actually the affected code is largely
unchanged up to now).

This is hard but reproducible:

general protection fault: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O      4.19.155-g...
...
RIP: 0010:sctp_assoc_control_transport+0x1db/0x290 [sctp]
Code: 01 00 00 00 00 00 00 bd 01 00 00 00 31 c0 48 89 e7 b9 10 00 00 00 f3 48 ab 48 8b 86 a8 00 00 00 48 89 e7 48 81 c6 88 00 00 00 <48> 63 90 bc 00 00 00 e8 29 61 2b e1 31 d2 41 b9 20 00 48 00 41 89
RSP: 0018:ffff88846fc43ba8 EFLAGS: 00010286
RAX: 7070682e72656d69 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8882984b3888 RDI: ffff88846fc43ba8
RBP: 0000000000000001 R08: 0000000000022e80 R09: ffff88846fc43cd8
R10: ffffffffa0562cf0 R11: 0000000000000007 R12: 0000000000000000
R13: ffff88846fc43cd8 R14: ffff8881bd588000 R15: ffff88846fc43e18
FS:  0000000000000000(0000) GS:ffff88846fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000024b3160 CR3: 000000000320a006 CR4: 00000000003607e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 sctp_do_8_2_transport_strike.isra.10+0xd3/0x1a0 [sctp]
 ? sctp_oname+0x20/0x20 [sctp]
 sctp_do_sm+0x15fb/0x1d00 [sctp]
 ? try_to_wake_up+0x226/0x4b0
 ? __update_load_avg_se+0x219/0x2c0
 ? enqueue_entity+0xc4/0x850
 ? enqueue_entity+0x17f/0x850
 ? enqueue_task_fair+0xe5/0x950
 ? __update_load_avg_cfs_rq+0x1e2/0x280
 ? resched_curr+0x20/0xd0
 ? check_preempt_curr+0x4e/0x90
 ? ttwu_do_wakeup.isra.21+0x19/0x170
 sctp_generate_timeout_event+0xa8/0xf0 [sctp]
 ? sctp_generate_t4_rto_event+0x20/0x20 [sctp]
 ? sctp_generate_t4_rto_event+0x20/0x20 [sctp]
 call_timer_fn+0x32/0x170
 expire_timers+0x9d/0x110
 run_timer_softirq+0x8a/0x160
 ? __hrtimer_run_queues+0x156/0x2e0
 __do_softirq+0xaf/0x33e
 irq_exit+0xbf/0xe0
 smp_apic_timer_interrupt+0x7d/0x170
 apic_timer_interrupt+0xf/0x20
 </IRQ>

The exploding code in sctp_assoc_control_transport() is:

        /* Generate and send a SCTP_PEER_ADDR_CHANGE notification                                                                                                                                                  
         * to the user.                                                                                                                                                                                            
         */                                                                                                                                                                                                        
        if (ulp_notify) {                                                                                                                                                                                          
                memset(&addr, 0, sizeof(struct sockaddr_storage));                                                                                                                                                 
                memcpy(&addr, &transport->ipaddr,                                                                                                                                                                  
                       transport->af_specific->sockaddr_len);                                                                                                                                                      
 
where "transport" pointer seems to be freed and re-used, so the dereference
to obtain "af_specific" leads to the dump above. This memset/memcpy pair
has been factored out into sctp_ulpevent_notify_peer_addr_change(), but
this most probably doesn't solve the problem.

According to the stack above, the race seems to be between this code:

enum sctp_disposition sctp_sf_t4_timer_expire(                                                                                                                                                                     
                                        struct net *net,                                                                                                                                                           
                                        const struct sctp_endpoint *ep,                                                                                                                                            
                                        const struct sctp_association *asoc,                                                                                                                                       
                                        const union sctp_subtype type,                                                                                                                                             
                                        void *arg,                                                                                                                                                                 
                                        struct sctp_cmd_seq *commands)                                                                                                                                             
{                                                                                                                                                                                                                  
        struct sctp_chunk *chunk = asoc->addip_last_asconf;                                                                                                                                                        
        struct sctp_transport *transport = chunk->transport;                                                                                                                                                       
                                                                                                                                                                                                                   
        SCTP_INC_STATS(net, SCTP_MIB_T4_RTO_EXPIREDS);                                                                                                                                                             
                                                                                                                                                                                                                   
        /* ADDIP 4.1 B1) Increment the error counters and perform path failure                                                                                                                                     
         * detection on the appropriate destination address as defined in                                                                                                                                          
         * RFC2960 [5] section 8.1 and 8.2.                                                                                                                                                                        
         */                                                                                                                                                                                                        
        if (transport)                                                                                                                                                                                             
                sctp_add_cmd_sf(commands, SCTP_CMD_STRIKE,                                                                                                                                                         
                                SCTP_TRANSPORT(transport));                                                                                                                                                        


and

void sctp_assoc_rm_peer(struct sctp_association *asoc,                                                                                                                                                             
                        struct sctp_transport *peer)                                                                                                                                                               
{                                                                                                                                                                                                                  
...
        /* If we remove the transport an ASCONF was last sent to, set it to                                                                                                                                        
         * NULL.                                                                                                                                                                                                   
         */                                                                                                                                                                                                        
        if (asoc->addip_last_asconf &&                                                                                                                                                                             
            asoc->addip_last_asconf->transport == peer)                                                                                                                                                            
                asoc->addip_last_asconf->transport = NULL;                                                                                                                                                         
                                                                                                                                                                                                                   
...
        asoc->peer.transport_count--;                                                                                                                                                                              
                                                                                                                                                                                                                   
        sctp_transport_free(peer);                                                                                                                                                                                 
}                                                                                                                                                                                                                  

So instead of doing sctp_transport_hold() on the addip_last_asconf->transport,
the code relies on the NULL assignment to be propagated.

As I do not see any memory barrier or lock here, I have several questions:

- Is it possible that sctp_assoc_control_transport() runs in a timer handler
  in parallel to sctp_assoc_rm_peer() running on a different core?
  In this case there would be no guarantee, that NULL assignment will reach
  another core. 

- What was the designed guarantee, that sctp_assoc_control_transport() will see
  the change to asoc->addip_last_asconf->transport = NULL ? 

- What about all the similar NULL assignments in sctp_assoc_rm_peer()?

-- 
Best regards,
Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Race of sctp_assoc_control_transport() against sctp_assoc_rm_peer() ?
  2020-12-08 20:59 Race of sctp_assoc_control_transport() against sctp_assoc_rm_peer() ? Alexander Sverdlin
@ 2021-05-12 16:08 ` Xin Long
  0 siblings, 0 replies; 2+ messages in thread
From: Xin Long @ 2021-05-12 16:08 UTC (permalink / raw)
  To: Alexander Sverdlin
  Cc: linux-sctp, Vlad Yasevich, Neil Horman, Marcelo Ricardo Leitner,
	Petr Malat

This should be fixed by:

commit 35b4f24415c854cd718ccdf38dbea6297f010aae
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat May 1 04:02:58 2021 +0800

    sctp: do asoc update earlier in sctp_sf_do_dupcook_a

On Tue, Dec 8, 2020 at 4:12 PM Alexander Sverdlin
<alexander.sverdlin@nokia.com> wrote:
>
> Hello!
>
> I'm trying to understand the crash we experience with Linux v4.19
> (sorry for the ancient codebase, but actually the affected code is largely
> unchanged up to now).
>
> This is hard but reproducible:
>
> general protection fault: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O      4.19.155-g...
> ...
> RIP: 0010:sctp_assoc_control_transport+0x1db/0x290 [sctp]
> Code: 01 00 00 00 00 00 00 bd 01 00 00 00 31 c0 48 89 e7 b9 10 00 00 00 f3 48 ab 48 8b 86 a8 00 00 00 48 89 e7 48 81 c6 88 00 00 00 <48> 63 90 bc 00 00 00 e8 29 61 2b e1 31 d2 41 b9 20 00 48 00 41 89
> RSP: 0018:ffff88846fc43ba8 EFLAGS: 00010286
> RAX: 7070682e72656d69 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff8882984b3888 RDI: ffff88846fc43ba8
> RBP: 0000000000000001 R08: 0000000000022e80 R09: ffff88846fc43cd8
> R10: ffffffffa0562cf0 R11: 0000000000000007 R12: 0000000000000000
> R13: ffff88846fc43cd8 R14: ffff8881bd588000 R15: ffff88846fc43e18
> FS:  0000000000000000(0000) GS:ffff88846fc40000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000024b3160 CR3: 000000000320a006 CR4: 00000000003607e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <IRQ>
>  sctp_do_8_2_transport_strike.isra.10+0xd3/0x1a0 [sctp]
>  ? sctp_oname+0x20/0x20 [sctp]
>  sctp_do_sm+0x15fb/0x1d00 [sctp]
>  ? try_to_wake_up+0x226/0x4b0
>  ? __update_load_avg_se+0x219/0x2c0
>  ? enqueue_entity+0xc4/0x850
>  ? enqueue_entity+0x17f/0x850
>  ? enqueue_task_fair+0xe5/0x950
>  ? __update_load_avg_cfs_rq+0x1e2/0x280
>  ? resched_curr+0x20/0xd0
>  ? check_preempt_curr+0x4e/0x90
>  ? ttwu_do_wakeup.isra.21+0x19/0x170
>  sctp_generate_timeout_event+0xa8/0xf0 [sctp]
>  ? sctp_generate_t4_rto_event+0x20/0x20 [sctp]
>  ? sctp_generate_t4_rto_event+0x20/0x20 [sctp]
>  call_timer_fn+0x32/0x170
>  expire_timers+0x9d/0x110
>  run_timer_softirq+0x8a/0x160
>  ? __hrtimer_run_queues+0x156/0x2e0
>  __do_softirq+0xaf/0x33e
>  irq_exit+0xbf/0xe0
>  smp_apic_timer_interrupt+0x7d/0x170
>  apic_timer_interrupt+0xf/0x20
>  </IRQ>
>
> The exploding code in sctp_assoc_control_transport() is:
>
>         /* Generate and send a SCTP_PEER_ADDR_CHANGE notification
>          * to the user.
>          */
>         if (ulp_notify) {
>                 memset(&addr, 0, sizeof(struct sockaddr_storage));
>                 memcpy(&addr, &transport->ipaddr,
>                        transport->af_specific->sockaddr_len);
>
> where "transport" pointer seems to be freed and re-used, so the dereference
> to obtain "af_specific" leads to the dump above. This memset/memcpy pair
> has been factored out into sctp_ulpevent_notify_peer_addr_change(), but
> this most probably doesn't solve the problem.
>
> According to the stack above, the race seems to be between this code:
>
> enum sctp_disposition sctp_sf_t4_timer_expire(
>                                         struct net *net,
>                                         const struct sctp_endpoint *ep,
>                                         const struct sctp_association *asoc,
>                                         const union sctp_subtype type,
>                                         void *arg,
>                                         struct sctp_cmd_seq *commands)
> {
>         struct sctp_chunk *chunk = asoc->addip_last_asconf;
>         struct sctp_transport *transport = chunk->transport;
>
>         SCTP_INC_STATS(net, SCTP_MIB_T4_RTO_EXPIREDS);
>
>         /* ADDIP 4.1 B1) Increment the error counters and perform path failure
>          * detection on the appropriate destination address as defined in
>          * RFC2960 [5] section 8.1 and 8.2.
>          */
>         if (transport)
>                 sctp_add_cmd_sf(commands, SCTP_CMD_STRIKE,
>                                 SCTP_TRANSPORT(transport));
>
>
> and
>
> void sctp_assoc_rm_peer(struct sctp_association *asoc,
>                         struct sctp_transport *peer)
> {
> ...
>         /* If we remove the transport an ASCONF was last sent to, set it to
>          * NULL.
>          */
>         if (asoc->addip_last_asconf &&
>             asoc->addip_last_asconf->transport == peer)
>                 asoc->addip_last_asconf->transport = NULL;
>
> ...
>         asoc->peer.transport_count--;
>
>         sctp_transport_free(peer);
> }
>
> So instead of doing sctp_transport_hold() on the addip_last_asconf->transport,
> the code relies on the NULL assignment to be propagated.
>
> As I do not see any memory barrier or lock here, I have several questions:
>
> - Is it possible that sctp_assoc_control_transport() runs in a timer handler
>   in parallel to sctp_assoc_rm_peer() running on a different core?
>   In this case there would be no guarantee, that NULL assignment will reach
>   another core.
>
> - What was the designed guarantee, that sctp_assoc_control_transport() will see
>   the change to asoc->addip_last_asconf->transport = NULL ?
>
> - What about all the similar NULL assignments in sctp_assoc_rm_peer()?
>
> --
> Best regards,
> Alexander Sverdlin.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-05-12 17:16 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-08 20:59 Race of sctp_assoc_control_transport() against sctp_assoc_rm_peer() ? Alexander Sverdlin
2021-05-12 16:08 ` Xin Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).