All of lore.kernel.org
 help / color / mirror / Atom feed
* ip_vs_conn_expire_now may cause timer callback runs on two CPUs for a same session
@ 2016-10-02 15:59 HePeng
  2016-10-02 18:30 ` Julian Anastasov
  0 siblings, 1 reply; 3+ messages in thread
From: HePeng @ 2016-10-02 15:59 UTC (permalink / raw)
  To: lvs-devel

Hi, 

I am a newbie to IPVS.
I read the code of ipvs in 3.10 kernel, and think the 
the implementation of *ip_vs_expire_now* may cause 
timer callback runs on two CPUs for a same session.



CPU 0                         CPU 1                   CPU2



                      a timer is detached 
                      from lists, and the 
                      callback fn is going 
                      to be called.

                                               a packet belongs 
                                               to the same session
                                               is processed by this CPU 
                                               and the timer is re-activated on
                                               this CPU. Then, the ref of *cp* is released.

ip_vs_conn_expire_now
is called on this 
session, which finds 
a pending timer, and 
then *mod_timer_pending*
will change the timer
to expire immediately.
read_unlock allows 
preemption again.

the timer expires and 
callback runs.         call back fn runs
                       

Am I right? This seems break the rule that *ip_vs_conn_expire* should only 
runs on one CPU at time per conn.
see http://oss.sgi.com/archives/netdev/2003-11/msg00763.html




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ip_vs_conn_expire_now may cause timer callback runs on two CPUs for a same session
  2016-10-02 15:59 ip_vs_conn_expire_now may cause timer callback runs on two CPUs for a same session HePeng
@ 2016-10-02 18:30 ` Julian Anastasov
       [not found]   ` <1E75A59A-CDE5-42F1-B914-96C4BD2E0916@icloud.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Julian Anastasov @ 2016-10-02 18:30 UTC (permalink / raw)
  To: HePeng; +Cc: lvs-devel


	Hello,

On Sun, 2 Oct 2016, HePeng wrote:

> Hi, 
> 
> I am a newbie to IPVS.
> I read the code of ipvs in 3.10 kernel, and think the 
> the implementation of *ip_vs_expire_now* may cause 
> timer callback runs on two CPUs for a same session.

	IIRC, timers can not be scheduled on multiple
CPUs at the same time. They can be migrated to other
CPUs but only if callback is not running. __mod_timer()
checks if the callback is running (base->running_timer).
If so, it will schedule the timer on the same CPU where the
callback is currently running. So, the callback can be
called twice but not in parallel. Then, when ip_vs_conn_expire()
is called it will call del_timer() to catch such situations.
It is specified in the comments.

> CPU 0                         CPU 1                   CPU2
> 
> 
> 
>                       a timer is detached 
>                       from lists, and the 
>                       callback fn is going 
>                       to be called.
> 
>                                                a packet belongs 
>                                                to the same session
>                                                is processed by this CPU 
>                                                and the timer is re-activated on
>                                                this CPU. Then, the ref of *cp* is released.
> 
> ip_vs_conn_expire_now
> is called on this 
> session, which finds 
> a pending timer, and 
> then *mod_timer_pending*
> will change the timer
> to expire immediately.
> read_unlock allows 
> preemption again.
> 
> the timer expires and 
> callback runs.         call back fn runs
>                        
> 
> Am I right? This seems break the rule that *ip_vs_conn_expire* should only 
> runs on one CPU at time per conn.

	This is guaranteed by the timers. But it is our job
to not start/leave the timer while final callback frees the conn.

> see http://oss.sgi.com/archives/netdev/2003-11/msg00763.html

	Without checking the code from 2003, more likely
the fix above stops calling mod_timer after the final del_timer
to prevent starting new callback. And this goal is what is
later specified in the comments in 3.10 and latest kernels.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ip_vs_conn_expire_now may cause timer callback runs on two CPUs for a same session
       [not found]   ` <1E75A59A-CDE5-42F1-B914-96C4BD2E0916@icloud.com>
@ 2016-10-03  7:49     ` Julian Anastasov
  0 siblings, 0 replies; 3+ messages in thread
From: Julian Anastasov @ 2016-10-03  7:49 UTC (permalink / raw)
  To: HePeng; +Cc: lvs-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 919 bytes --]


	Hello,

On Mon, 3 Oct 2016, HePeng wrote:

> Hello Julian, 
> Thank you very much for the reply.
> 
> I guess this is the same reason that 
> instead of using *__ip_vs_conn_get*, 
> you use *atomic_inc* to the ref of 
> *cp* in *ip_vs_conn_expire* when fails to unlink *cp*, 
> since there is no callback is running on other CPUs, the 
> ref must be large than 0. That is also 
> one of my questions when I read the code. 

	Yes, ip_vs_conn_unlink() can see refcnt=2 or more
in which case it does not change refcnt (1->0) and conn remains
hashed. But we have to start the timer again, so the pair
atomic_inc+__ip_vs_conn_put_timer is used. It looks safe
to call __ip_vs_conn_get, atomic_inc_not_zero will work
in this case because while in timer callback refcnt can not
go below 1, even if other CPUs call ip_vs_conn_unhash+
ip_vs_conn_hash. But atomic_inc is better.

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-10-03  7:49 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-02 15:59 ip_vs_conn_expire_now may cause timer callback runs on two CPUs for a same session HePeng
2016-10-02 18:30 ` Julian Anastasov
     [not found]   ` <1E75A59A-CDE5-42F1-B914-96C4BD2E0916@icloud.com>
2016-10-03  7:49     ` Julian Anastasov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.