All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
@ 2017-08-07 18:16 Rao Shoaib
  2017-08-08  9:59 ` Eric Dumazet
  2017-08-08 17:25 ` Yuchung Cheng
  0 siblings, 2 replies; 13+ messages in thread
From: Rao Shoaib @ 2017-08-07 18:16 UTC (permalink / raw)
  To: davem, kuznet; +Cc: netdev

Change from version 0: Rationale behind the change:

The man page for tcp(7) states

when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
override keepalive to  determine  when to close a connection due to keepalive
failure.

This is ambigious at best. user expectation is most likely that the connection
will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.

The code however waits for the keepalive to kick-in (default 2hrs) and than
after one failure resets the conenction. 

What is the rationale for that ? The same effect can be obtained by simply
changing the value of tcp_keep_alive_probes.

Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow 
the RFC. Which states

4.2 TCP keep-Alives:
   Some TCP implementations, such as those in BSD systems, use a
   different abort policy for TCP keep-alives than for user data.  Thus,
   the TCP keep-alive mechanism might abort a connection that would
   otherwise have survived the transient period without connectivity.
   Therefore, if a connection that enables keep-alives is also using the
   TCP User Timeout Option, then the keep-alive timer MUST be set to a
   value larger than that of the adopted USER TIMEOUT.

This patch enforces the MUST and also dis-associates user timeout from keep
alive.  A man page patch will be submitted separately.

Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
---
 net/ipv4/tcp.c       | 10 ++++++++--
 net/ipv4/tcp_timer.c |  9 +--------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 71ce33d..f2af44d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2628,7 +2628,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		break;
 
 	case TCP_KEEPIDLE:
-		if (val < 1 || val > MAX_TCP_KEEPIDLE)
+		/* Per RFC5482 keepalive_time must be > user_timeout */
+		if (val < 1 || val > MAX_TCP_KEEPIDLE ||
+		    ((val * HZ) <= icsk->icsk_user_timeout))
 			err = -EINVAL;
 		else {
 			tp->keepalive_time = val * HZ;
@@ -2724,8 +2726,12 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 	case TCP_USER_TIMEOUT:
 		/* Cap the max time in ms TCP will retry or probe the window
 		 * before giving up and aborting (ETIMEDOUT) a connection.
+		 * Per RFC5482 TCP user timeout must be < keepalive_time.
+		 * If the default value changes later -- all bets are off.
 		 */
-		if (val < 0)
+		if (val < 0 || (tp->keepalive_time &&
+				tp->keepalive_time <= msecs_to_jiffies(val)) ||
+		   net->ipv4.sysctl_tcp_keepalive_time <= msecs_to_jiffies(val))
 			err = -EINVAL;
 		else
 			icsk->icsk_user_timeout = msecs_to_jiffies(val);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index c0feeee..d39fe60 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -664,14 +664,7 @@ static void tcp_keepalive_timer (unsigned long data)
 	elapsed = keepalive_time_elapsed(tp);
 
 	if (elapsed >= keepalive_time_when(tp)) {
-		/* If the TCP_USER_TIMEOUT option is enabled, use that
-		 * to determine when to timeout instead.
-		 */
-		if ((icsk->icsk_user_timeout != 0 &&
-		    elapsed >= icsk->icsk_user_timeout &&
-		    icsk->icsk_probes_out > 0) ||
-		    (icsk->icsk_user_timeout == 0 &&
-		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
+		if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
 			tcp_send_active_reset(sk, GFP_ATOMIC);
 			tcp_write_err(sk);
 			goto out;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-07 18:16 [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482 Rao Shoaib
@ 2017-08-08  9:59 ` Eric Dumazet
  2017-08-08 17:25 ` Yuchung Cheng
  1 sibling, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2017-08-08  9:59 UTC (permalink / raw)
  To: Rao Shoaib; +Cc: davem, kuznet, netdev

On Mon, 2017-08-07 at 11:16 -0700, Rao Shoaib wrote:
> Change from version 0: Rationale behind the change:
> 
> The man page for tcp(7) states
> 
> when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
> override keepalive to  determine  when to close a connection due to keepalive
> failure.
> 
> This is ambigious at best. user expectation is most likely that the connection
> will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.
> 
> The code however waits for the keepalive to kick-in (default 2hrs) and than
> after one failure resets the conenction. 
> 
> What is the rationale for that ? The same effect can be obtained by simply
> changing the value of tcp_keep_alive_probes.
> 
> Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow 
> the RFC. Which states
> 
> 4.2 TCP keep-Alives:
>    Some TCP implementations, such as those in BSD systems, use a
>    different abort policy for TCP keep-alives than for user data.  Thus,
>    the TCP keep-alive mechanism might abort a connection that would
>    otherwise have survived the transient period without connectivity.
>    Therefore, if a connection that enables keep-alives is also using the
>    TCP User Timeout Option, then the keep-alive timer MUST be set to a
>    value larger than that of the adopted USER TIMEOUT.
> 
> This patch enforces the MUST and also dis-associates user timeout from keep
> alive.  A man page patch will be submitted separately.
> 
> Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
> ---
>  net/ipv4/tcp.c       | 10 ++++++++--
>  net/ipv4/tcp_timer.c |  9 +--------
>  2 files changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 71ce33d..f2af44d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2628,7 +2628,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>  		break;
>  
>  	case TCP_KEEPIDLE:
> -		if (val < 1 || val > MAX_TCP_KEEPIDLE)
> +		/* Per RFC5482 keepalive_time must be > user_timeout */
> +		if (val < 1 || val > MAX_TCP_KEEPIDLE ||
> +		    ((val * HZ) <= icsk->icsk_user_timeout))
>  			err = -EINVAL;
>  		else {
>  			tp->keepalive_time = val * HZ;
> @@ -2724,8 +2726,12 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>  	case TCP_USER_TIMEOUT:
>  		/* Cap the max time in ms TCP will retry or probe the window
>  		 * before giving up and aborting (ETIMEDOUT) a connection.
> +		 * Per RFC5482 TCP user timeout must be < keepalive_time.
> +		 * If the default value changes later -- all bets are off.
>  		 */
> -		if (val < 0)
> +		if (val < 0 || (tp->keepalive_time &&
> +				tp->keepalive_time <= msecs_to_jiffies(val)) ||
> +		   net->ipv4.sysctl_tcp_keepalive_time <= msecs_to_jiffies(val))


When TCP_USER_TIMEOUT socket option is attempted, maybe keepalive option
was not used.

Yet your new tests assume it is engaged.

It might break some usages.

>  			err = -EINVAL;
>  		else
>  			icsk->icsk_user_timeout = msecs_to_jiffies(val);
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index c0feeee..d39fe60 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -664,14 +664,7 @@ static void tcp_keepalive_timer (unsigned long data)
>  	elapsed = keepalive_time_elapsed(tp);
>  
>  	if (elapsed >= keepalive_time_when(tp)) {
> -		/* If the TCP_USER_TIMEOUT option is enabled, use that
> -		 * to determine when to timeout instead.
> -		 */
> -		if ((icsk->icsk_user_timeout != 0 &&
> -		    elapsed >= icsk->icsk_user_timeout &&
> -		    icsk->icsk_probes_out > 0) ||
> -		    (icsk->icsk_user_timeout == 0 &&
> -		    icsk->icsk_probes_out >= keepalive_probes(tp))) {
> +		if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
>  			tcp_send_active_reset(sk, GFP_ATOMIC);
>  			tcp_write_err(sk);
>  			goto out;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-07 18:16 [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482 Rao Shoaib
  2017-08-08  9:59 ` Eric Dumazet
@ 2017-08-08 17:25 ` Yuchung Cheng
  2017-08-09 23:52   ` Jerry Chu
  1 sibling, 1 reply; 13+ messages in thread
From: Yuchung Cheng @ 2017-08-08 17:25 UTC (permalink / raw)
  To: Rao Shoaib; +Cc: David Miller, Alexey Kuznetsov, netdev, Jerry Chu

On Mon, Aug 7, 2017 at 11:16 AM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
> Change from version 0: Rationale behind the change:
>
> The man page for tcp(7) states
>
> when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
> override keepalive to  determine  when to close a connection due to keepalive
> failure.
>
> This is ambigious at best. user expectation is most likely that the connection
> will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.
ccing the original author Jerry Chu who can tell more.

>
> The code however waits for the keepalive to kick-in (default 2hrs) and than
> after one failure resets the conenction.
>
> What is the rationale for that ? The same effect can be obtained by simply
> changing the value of tcp_keep_alive_probes.
>
> Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow
> the RFC. Which states
>
> 4.2 TCP keep-Alives:
>    Some TCP implementations, such as those in BSD systems, use a
>    different abort policy for TCP keep-alives than for user data.  Thus,
>    the TCP keep-alive mechanism might abort a connection that would
>    otherwise have survived the transient period without connectivity.
>    Therefore, if a connection that enables keep-alives is also using the
>    TCP User Timeout Option, then the keep-alive timer MUST be set to a
>    value larger than that of the adopted USER TIMEOUT.
>
> This patch enforces the MUST and also dis-associates user timeout from keep
> alive.  A man page patch will be submitted separately.
>
> Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
> ---
>  net/ipv4/tcp.c       | 10 ++++++++--
>  net/ipv4/tcp_timer.c |  9 +--------
>  2 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 71ce33d..f2af44d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2628,7 +2628,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>                 break;
>
>         case TCP_KEEPIDLE:
> -               if (val < 1 || val > MAX_TCP_KEEPIDLE)
> +               /* Per RFC5482 keepalive_time must be > user_timeout */
> +               if (val < 1 || val > MAX_TCP_KEEPIDLE ||
> +                   ((val * HZ) <= icsk->icsk_user_timeout))
>                         err = -EINVAL;
>                 else {
>                         tp->keepalive_time = val * HZ;
> @@ -2724,8 +2726,12 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>         case TCP_USER_TIMEOUT:
>                 /* Cap the max time in ms TCP will retry or probe the window
>                  * before giving up and aborting (ETIMEDOUT) a connection.
> +                * Per RFC5482 TCP user timeout must be < keepalive_time.
> +                * If the default value changes later -- all bets are off.
>                  */
> -               if (val < 0)
> +               if (val < 0 || (tp->keepalive_time &&
> +                               tp->keepalive_time <= msecs_to_jiffies(val)) ||
> +                  net->ipv4.sysctl_tcp_keepalive_time <= msecs_to_jiffies(val))
>                         err = -EINVAL;
>                 else
>                         icsk->icsk_user_timeout = msecs_to_jiffies(val);
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index c0feeee..d39fe60 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -664,14 +664,7 @@ static void tcp_keepalive_timer (unsigned long data)
>         elapsed = keepalive_time_elapsed(tp);
>
>         if (elapsed >= keepalive_time_when(tp)) {
> -               /* If the TCP_USER_TIMEOUT option is enabled, use that
> -                * to determine when to timeout instead.
> -                */
> -               if ((icsk->icsk_user_timeout != 0 &&
> -                   elapsed >= icsk->icsk_user_timeout &&
> -                   icsk->icsk_probes_out > 0) ||
> -                   (icsk->icsk_user_timeout == 0 &&
> -                   icsk->icsk_probes_out >= keepalive_probes(tp))) {
> +               if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>                         tcp_write_err(sk);
>                         goto out;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-08 17:25 ` Yuchung Cheng
@ 2017-08-09 23:52   ` Jerry Chu
  2017-08-10  0:20     ` Joe Smith
  0 siblings, 1 reply; 13+ messages in thread
From: Jerry Chu @ 2017-08-09 23:52 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Rao Shoaib, David Miller, Alexey Kuznetsov, netdev

[try to recover from long lost memory]

On Tue, Aug 8, 2017 at 10:25 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Mon, Aug 7, 2017 at 11:16 AM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>> Change from version 0: Rationale behind the change:
>>
>> The man page for tcp(7) states
>>
>> when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
>> override keepalive to  determine  when to close a connection due to keepalive
>> failure.
>>
>> This is ambigious at best. user expectation is most likely that the connection
>> will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.
> ccing the original author Jerry Chu who can tell more.
>

There was a reason for the above otherwise I wouldn't have explicitly
spelled it out in
my commit msg. But unfortunately it was seven years ago and I can't
remember why.
It could range from micro-optimization (saving a syscall() because
this facility was
used by servers handling millions of Android clients) to something more critical
but I can't remember.

>>
>> The code however waits for the keepalive to kick-in (default 2hrs) and than
>> after one failure resets the conenction.
>>
>> What is the rationale for that ? The same effect can be obtained by simply
>> changing the value of tcp_keep_alive_probes.
>>
>> Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow
>> the RFC. Which states

Well the patch has little to do with RFC5482 other than borrowing the name, and
also conveniently providing a mechanism for RFC5482 apps to program the local
timeout value. As far as I knew back when I worked on the patch, RFC5482 was
under little use (told directly by Lars).

Your proposed change may not be unreasonable but my fear is it may
cause breakage
on apps that depend on "TCP_USER_TIMEOUT will overtake keepalive to determine
when to close a connection due to keepalive failure". What is your
case for "RFC5482
compliance" after all? I know the TCP_USER_TIMEOUT option has been very popular
among apps since its inception.

>>
>> 4.2 TCP keep-Alives:
>>    Some TCP implementations, such as those in BSD systems, use a
>>    different abort policy for TCP keep-alives than for user data.  Thus,
>>    the TCP keep-alive mechanism might abort a connection that would
>>    otherwise have survived the transient period without connectivity.
>>    Therefore, if a connection that enables keep-alives is also using the
>>    TCP User Timeout Option, then the keep-alive timer MUST be set to a
>>    value larger than that of the adopted USER TIMEOUT.
>>
>> This patch enforces the MUST and also dis-associates user timeout from keep
>> alive.  A man page patch will be submitted separately.
>>
>> Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
>> ---
>>  net/ipv4/tcp.c       | 10 ++++++++--
>>  net/ipv4/tcp_timer.c |  9 +--------
>>  2 files changed, 9 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index 71ce33d..f2af44d 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -2628,7 +2628,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>>                 break;
>>
>>         case TCP_KEEPIDLE:
>> -               if (val < 1 || val > MAX_TCP_KEEPIDLE)
>> +               /* Per RFC5482 keepalive_time must be > user_timeout */
>> +               if (val < 1 || val > MAX_TCP_KEEPIDLE ||
>> +                   ((val * HZ) <= icsk->icsk_user_timeout))
>>                         err = -EINVAL;
>>                 else {
>>                         tp->keepalive_time = val * HZ;
>> @@ -2724,8 +2726,12 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>>         case TCP_USER_TIMEOUT:
>>                 /* Cap the max time in ms TCP will retry or probe the window
>>                  * before giving up and aborting (ETIMEDOUT) a connection.
>> +                * Per RFC5482 TCP user timeout must be < keepalive_time.
>> +                * If the default value changes later -- all bets are off.
>>                  */
>> -               if (val < 0)
>> +               if (val < 0 || (tp->keepalive_time &&
>> +                               tp->keepalive_time <= msecs_to_jiffies(val)) ||
>> +                  net->ipv4.sysctl_tcp_keepalive_time <= msecs_to_jiffies(val))
>>                         err = -EINVAL;
>>                 else
>>                         icsk->icsk_user_timeout = msecs_to_jiffies(val);
>> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
>> index c0feeee..d39fe60 100644
>> --- a/net/ipv4/tcp_timer.c
>> +++ b/net/ipv4/tcp_timer.c
>> @@ -664,14 +664,7 @@ static void tcp_keepalive_timer (unsigned long data)
>>         elapsed = keepalive_time_elapsed(tp);
>>
>>         if (elapsed >= keepalive_time_when(tp)) {
>> -               /* If the TCP_USER_TIMEOUT option is enabled, use that
>> -                * to determine when to timeout instead.
>> -                */
>> -               if ((icsk->icsk_user_timeout != 0 &&
>> -                   elapsed >= icsk->icsk_user_timeout &&
>> -                   icsk->icsk_probes_out > 0) ||
>> -                   (icsk->icsk_user_timeout == 0 &&
>> -                   icsk->icsk_probes_out >= keepalive_probes(tp))) {
>> +               if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
>>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>>                         tcp_write_err(sk);
>>                         goto out;
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-09 23:52   ` Jerry Chu
@ 2017-08-10  0:20     ` Joe Smith
  2017-08-10  0:30       ` David Miller
  2017-08-10  0:31       ` Rao Shoaib
  0 siblings, 2 replies; 13+ messages in thread
From: Joe Smith @ 2017-08-10  0:20 UTC (permalink / raw)
  To: Jerry Chu
  Cc: Yuchung Cheng, Rao Shoaib, David Miller, Alexey Kuznetsov, netdev

On Wed, Aug 9, 2017 at 4:52 PM, Jerry Chu <hkchu@google.com> wrote:
> [try to recover from long lost memory]
>
> On Tue, Aug 8, 2017 at 10:25 AM, Yuchung Cheng <ycheng@google.com> wrote:
>> On Mon, Aug 7, 2017 at 11:16 AM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>>> Change from version 0: Rationale behind the change:
>>>
>>> The man page for tcp(7) states
>>>
>>> when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
>>> override keepalive to  determine  when to close a connection due to keepalive
>>> failure.
>>>
>>> This is ambigious at best. user expectation is most likely that the connection
>>> will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.
>> ccing the original author Jerry Chu who can tell more.
>>
>
> There was a reason for the above otherwise I wouldn't have explicitly
> spelled it out in
> my commit msg. But unfortunately it was seven years ago and I can't
> remember why.
> It could range from micro-optimization (saving a syscall() because
> this facility was
> used by servers handling millions of Android clients) to something more critical
> but I can't remember.

The issue is that the man page is ambiguous and does not conform to
any standard.
Whether  RFC 5482 is in little use or not that was cited as the basis
of this change and
I want to change the behavior to conform to it as users are confused.

I doubt that saving a syscall is of any benefit when the connection
has been idle for 2hrs. If anything the user expects the keep alive
probes to start after TCP_USER_TIMEOUT of inactivity. In which case
keep alive should be adjusted.

>
>>>
>>> The code however waits for the keepalive to kick-in (default 2hrs) and than
>>> after one failure resets the conenction.
>>>
>>> What is the rationale for that ? The same effect can be obtained by simply
>>> changing the value of tcp_keep_alive_probes.
>>>
>>> Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow
>>> the RFC. Which states
>
> Well the patch has little to do with RFC5482 other than borrowing the name, and
> also conveniently providing a mechanism for RFC5482 apps to program the local
> timeout value. As far as I knew back when I worked on the patch, RFC5482 was
> under little use (told directly by Lars).
>
> Your proposed change may not be unreasonable but my fear is it may
> cause breakage
> on apps that depend on "TCP_USER_TIMEOUT will overtake keepalive to determine
> when to close a connection due to keepalive failure". What is your
> case for "RFC5482
> compliance" after all? I know the TCP_USER_TIMEOUT option has been very popular
> among apps since its inception.

The only use of TCP_USER_TIMEOUT has been for flushing unacknowledged
data (evident from all the fixes). That behavior is not being touched.

Making Linux conform to standards and behavior that is logical seems
like a good enough reason. Mixing keep alive and TCP_USER_TIMEOUT does
not make any sense. I doubt very much if this change will break
anything but if it does than we need to see why that is needed and
implement a proper fix and document it.

Shoaib

>
>>>
>>> 4.2 TCP keep-Alives:
>>>    Some TCP implementations, such as those in BSD systems, use a
>>>    different abort policy for TCP keep-alives than for user data.  Thus,
>>>    the TCP keep-alive mechanism might abort a connection that would
>>>    otherwise have survived the transient period without connectivity.
>>>    Therefore, if a connection that enables keep-alives is also using the
>>>    TCP User Timeout Option, then the keep-alive timer MUST be set to a
>>>    value larger than that of the adopted USER TIMEOUT.
>>>
>>> This patch enforces the MUST and also dis-associates user timeout from keep
>>> alive.  A man page patch will be submitted separately.
>>>
>>> Signed-off-by: Rao Shoaib <rao.shoaib@oracle.com>
>>> ---
>>>  net/ipv4/tcp.c       | 10 ++++++++--
>>>  net/ipv4/tcp_timer.c |  9 +--------
>>>  2 files changed, 9 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>>> index 71ce33d..f2af44d 100644
>>> --- a/net/ipv4/tcp.c
>>> +++ b/net/ipv4/tcp.c
>>> @@ -2628,7 +2628,9 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>>>                 break;
>>>
>>>         case TCP_KEEPIDLE:
>>> -               if (val < 1 || val > MAX_TCP_KEEPIDLE)
>>> +               /* Per RFC5482 keepalive_time must be > user_timeout */
>>> +               if (val < 1 || val > MAX_TCP_KEEPIDLE ||
>>> +                   ((val * HZ) <= icsk->icsk_user_timeout))
>>>                         err = -EINVAL;
>>>                 else {
>>>                         tp->keepalive_time = val * HZ;
>>> @@ -2724,8 +2726,12 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
>>>         case TCP_USER_TIMEOUT:
>>>                 /* Cap the max time in ms TCP will retry or probe the window
>>>                  * before giving up and aborting (ETIMEDOUT) a connection.
>>> +                * Per RFC5482 TCP user timeout must be < keepalive_time.
>>> +                * If the default value changes later -- all bets are off.
>>>                  */
>>> -               if (val < 0)
>>> +               if (val < 0 || (tp->keepalive_time &&
>>> +                               tp->keepalive_time <= msecs_to_jiffies(val)) ||
>>> +                  net->ipv4.sysctl_tcp_keepalive_time <= msecs_to_jiffies(val))
>>>                         err = -EINVAL;
>>>                 else
>>>                         icsk->icsk_user_timeout = msecs_to_jiffies(val);
>>> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
>>> index c0feeee..d39fe60 100644
>>> --- a/net/ipv4/tcp_timer.c
>>> +++ b/net/ipv4/tcp_timer.c
>>> @@ -664,14 +664,7 @@ static void tcp_keepalive_timer (unsigned long data)
>>>         elapsed = keepalive_time_elapsed(tp);
>>>
>>>         if (elapsed >= keepalive_time_when(tp)) {
>>> -               /* If the TCP_USER_TIMEOUT option is enabled, use that
>>> -                * to determine when to timeout instead.
>>> -                */
>>> -               if ((icsk->icsk_user_timeout != 0 &&
>>> -                   elapsed >= icsk->icsk_user_timeout &&
>>> -                   icsk->icsk_probes_out > 0) ||
>>> -                   (icsk->icsk_user_timeout == 0 &&
>>> -                   icsk->icsk_probes_out >= keepalive_probes(tp))) {
>>> +               if (icsk->icsk_probes_out >= keepalive_probes(tp)) {
>>>                         tcp_send_active_reset(sk, GFP_ATOMIC);
>>>                         tcp_write_err(sk);
>>>                         goto out;
>>> --
>>> 2.7.4
>>>



-- 
JS

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  0:20     ` Joe Smith
@ 2017-08-10  0:30       ` David Miller
  2017-08-10  0:47         ` Rao Shoaib
  2017-08-10  0:31       ` Rao Shoaib
  1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2017-08-10  0:30 UTC (permalink / raw)
  To: codesoldier1; +Cc: hkchu, ycheng, rao.shoaib, kuznet, netdev

From: Joe Smith <codesoldier1@gmail.com>
Date: Wed, 9 Aug 2017 17:20:32 -0700

> Making Linux conform to standards and behavior that is logical seems
> like a good enough reason.

That's an awesome attitude to have when we're implementing something
new and don't have the facility already.

But when we have something already the only important consideration is
not breaking existing apps which rely on that behavior.

That is much, much, more important than standards compliance.

If users are confused, just fix the documentation.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  0:20     ` Joe Smith
  2017-08-10  0:30       ` David Miller
@ 2017-08-10  0:31       ` Rao Shoaib
  1 sibling, 0 replies; 13+ messages in thread
From: Rao Shoaib @ 2017-08-10  0:31 UTC (permalink / raw)
  To: Joe Smith, Jerry Chu
  Cc: Yuchung Cheng, David Miller, Alexey Kuznetsov, netdev



On 08/09/2017 05:20 PM, Joe Smith wrote:
> On Wed, Aug 9, 2017 at 4:52 PM, Jerry Chu <hkchu@google.com> wrote:
>> [try to recover from long lost memory]
>>
>> On Tue, Aug 8, 2017 at 10:25 AM, Yuchung Cheng <ycheng@google.com> wrote:
>>> On Mon, Aug 7, 2017 at 11:16 AM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>>>> Change from version 0: Rationale behind the change:
>>>>
>>>> The man page for tcp(7) states
>>>>
>>>> when used with the TCP keepalive (SO_KEEPALIVE) option, TCP_USER_TIMEOUT will
>>>> override keepalive to  determine  when to close a connection due to keepalive
>>>> failure.
>>>>
>>>> This is ambigious at best. user expectation is most likely that the connection
>>>> will be reset after TCP_USER_TIMEOUT milliseconds of inactivity.
>>> ccing the original author Jerry Chu who can tell more.
>>>
>> There was a reason for the above otherwise I wouldn't have explicitly
>> spelled it out in
>> my commit msg. But unfortunately it was seven years ago and I can't
>> remember why.
>> It could range from micro-optimization (saving a syscall() because
>> this facility was
>> used by servers handling millions of Android clients) to something more critical
>> but I can't remember.
> The issue is that the man page is ambiguous and does not conform to
> any standard.
> Whether  RFC 5482 is in little use or not that was cited as the basis
> of this change and
> I want to change the behavior to conform to it as users are confused.
>
> I doubt that saving a syscall is of any benefit when the connection
> has been idle for 2hrs. If anything the user expects the keep alive
> probes to start after TCP_USER_TIMEOUT of inactivity. In which case
> keep alive should be adjusted.
>
>>>> The code however waits for the keepalive to kick-in (default 2hrs) and than
>>>> after one failure resets the conenction.
>>>>
>>>> What is the rationale for that ? The same effect can be obtained by simply
>>>> changing the value of tcp_keep_alive_probes.
>>>>
>>>> Since the TCP_USER_TIMEOUT option was added based on RFC 5482 we need to follow
>>>> the RFC. Which states
>> Well the patch has little to do with RFC5482 other than borrowing the name, and
>> also conveniently providing a mechanism for RFC5482 apps to program the local
>> timeout value. As far as I knew back when I worked on the patch, RFC5482 was
>> under little use (told directly by Lars).
>>
>> Your proposed change may not be unreasonable but my fear is it may
>> cause breakage
>> on apps that depend on "TCP_USER_TIMEOUT will overtake keepalive to determine
>> when to close a connection due to keepalive failure". What is your
>> case for "RFC5482
>> compliance" after all? I know the TCP_USER_TIMEOUT option has been very popular
>> among apps since its inception.
> The only use of TCP_USER_TIMEOUT has been for flushing unacknowledged
> data (evident from all the fixes). That behavior is not being touched.
>
> Making Linux conform to standards and behavior that is logical seems
> like a good enough reason. Mixing keep alive and TCP_USER_TIMEOUT does
> not make any sense. I doubt very much if this change will break
> anything but if it does than we need to see why that is needed and
> implement a proper fix and document it.
>
>
The behavior for the main use case has been previously changed as part 
of bug fixes. This is a very low risk change and makes the code logical 
and clean.

Shoaib

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  0:30       ` David Miller
@ 2017-08-10  0:47         ` Rao Shoaib
  2017-08-10  0:52           ` David Miller
  2017-08-10  3:32           ` Jerry Chu
  0 siblings, 2 replies; 13+ messages in thread
From: Rao Shoaib @ 2017-08-10  0:47 UTC (permalink / raw)
  To: David Miller, codesoldier1; +Cc: hkchu, ycheng, kuznet, netdev



On 08/09/2017 05:30 PM, David Miller wrote:
> From: Joe Smith <codesoldier1@gmail.com>
> Date: Wed, 9 Aug 2017 17:20:32 -0700
>
>> Making Linux conform to standards and behavior that is logical seems
>> like a good enough reason.
> That's an awesome attitude to have when we're implementing something
> new and don't have the facility already.
>
> But when we have something already the only important consideration is
> not breaking existing apps which rely on that behavior.
>
> That is much, much, more important than standards compliance.
>
> If users are confused, just fix the documentation.
David,

If it was just confusion than sure fixing the documentation is fine. 
What if the logic is incorrect, does not conform to the standard that is 
says it is implementing and easy to fix with little or no risk of breakage.

The proposed patch changes a feature that no one uses. It also imposes 
the relation ship between keepalive and timeout values that is required 
by the RFC and make sense.

You are the final authority, if you say we should just fix the 
documentation than that is fine.

Shoaib

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  0:47         ` Rao Shoaib
@ 2017-08-10  0:52           ` David Miller
  2017-08-10  3:32           ` Jerry Chu
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2017-08-10  0:52 UTC (permalink / raw)
  To: rao.shoaib; +Cc: codesoldier1, hkchu, ycheng, kuznet, netdev

From: Rao Shoaib <rao.shoaib@oracle.com>
Date: Wed, 9 Aug 2017 17:47:57 -0700

> 
> 
> On 08/09/2017 05:30 PM, David Miller wrote:
>> From: Joe Smith <codesoldier1@gmail.com>
>> Date: Wed, 9 Aug 2017 17:20:32 -0700
>>
>>> Making Linux conform to standards and behavior that is logical seems
>>> like a good enough reason.
>> That's an awesome attitude to have when we're implementing something
>> new and don't have the facility already.
>>
>> But when we have something already the only important consideration is
>> not breaking existing apps which rely on that behavior.
>>
>> That is much, much, more important than standards compliance.
>>
>> If users are confused, just fix the documentation.
> David,
> 
> If it was just confusion than sure fixing the documentation is
> fine. What if the logic is incorrect, does not conform to the standard
> that is says it is implementing and easy to fix with little or no risk
> of breakage.
> 
> The proposed patch changes a feature that no one uses. It also imposes
> the relation ship between keepalive and timeout values that is
> required by the RFC and make sense.
> 
> You are the final authority, if you say we should just fix the
> documentation than that is fine.

I want to hear more about what hkchu and ycheng have to say about
this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  0:47         ` Rao Shoaib
  2017-08-10  0:52           ` David Miller
@ 2017-08-10  3:32           ` Jerry Chu
  2017-08-10  4:59             ` Jerry Chu
  1 sibling, 1 reply; 13+ messages in thread
From: Jerry Chu @ 2017-08-10  3:32 UTC (permalink / raw)
  To: Rao Shoaib
  Cc: David Miller, codesoldier1, Yuchung Cheng, Alexey Kuznetsov, netdev

On Wed, Aug 9, 2017 at 5:47 PM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>
>
> On 08/09/2017 05:30 PM, David Miller wrote:
>>
>> From: Joe Smith <codesoldier1@gmail.com>
>> Date: Wed, 9 Aug 2017 17:20:32 -0700
>>
>>> Making Linux conform to standards and behavior that is logical seems
>>> like a good enough reason.
>>
>> That's an awesome attitude to have when we're implementing something
>> new and don't have the facility already.
>>
>> But when we have something already the only important consideration is
>> not breaking existing apps which rely on that behavior.
>>
>> That is much, much, more important than standards compliance.
>>
>> If users are confused, just fix the documentation.
>
> David,
>
> If it was just confusion than sure fixing the documentation is fine. What if
> the logic is incorrect, does not conform to the standard that is says it is

Not sure what part of logic is "incorrect" when it was a homegrown Linux API
with no need to conform with any "standard"? Note that the new API was invented
7 years ago not out of need for RFC5482. In fact I initially call the option
TCP_FAILFAST and did not even know the existence of RFC5482 until someone
around the same time proposed a UTO option specifically for RFC5482 and I
thought the two can be combined. (This is roughly the memory I can
recollect so far.)

So you see my focus back then was to devise a "failfast" option whereas RFC5482
was meant for a "failslow" case. I think that explains why I let the
option override
keepalive so a TCP connection can "fail fast" while RFC5482 4.2 tries to prevent
keepalive failure ahead of UTO, favoring "fail slow".

If we start from a clean slate then perhaps one can argue the semantic
either way
but we do not have a clean slate. For that I still slightly favor not
changing the code
because the risk of breakage is definitely non-zero and the issue you're having
seem to be only related to documentation.

Jerry

> implementing and easy to fix with little or no risk of breakage.
>
> The proposed patch changes a feature that no one uses. It also imposes the
> relation ship between keepalive and timeout values that is required by the
> RFC and make sense.
>
> You are the final authority, if you say we should just fix the documentation
> than that is fine.
>
> Shoaib
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  3:32           ` Jerry Chu
@ 2017-08-10  4:59             ` Jerry Chu
  2017-08-10 21:05               ` Rao Shoaib
  0 siblings, 1 reply; 13+ messages in thread
From: Jerry Chu @ 2017-08-10  4:59 UTC (permalink / raw)
  To: Rao Shoaib
  Cc: David Miller, codesoldier1, Yuchung Cheng, Alexey Kuznetsov, netdev

On Wed, Aug 9, 2017 at 8:32 PM, Jerry Chu <hkchu@google.com> wrote:
> On Wed, Aug 9, 2017 at 5:47 PM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>>
>>
>> On 08/09/2017 05:30 PM, David Miller wrote:
>>>
>>> From: Joe Smith <codesoldier1@gmail.com>
>>> Date: Wed, 9 Aug 2017 17:20:32 -0700
>>>
>>>> Making Linux conform to standards and behavior that is logical seems
>>>> like a good enough reason.
>>>
>>> That's an awesome attitude to have when we're implementing something
>>> new and don't have the facility already.
>>>
>>> But when we have something already the only important consideration is
>>> not breaking existing apps which rely on that behavior.
>>>
>>> That is much, much, more important than standards compliance.
>>>
>>> If users are confused, just fix the documentation.
>>
>> David,
>>
>> If it was just confusion than sure fixing the documentation is fine. What if
>> the logic is incorrect, does not conform to the standard that is says it is
>
> Not sure what part of logic is "incorrect" when it was a homegrown Linux API
> with no need to conform with any "standard"? Note that the new API was invented
> 7 years ago not out of need for RFC5482. In fact I initially call the option
> TCP_FAILFAST and did not even know the existence of RFC5482 until someone
> around the same time proposed a UTO option specifically for RFC5482 and I
> thought the two can be combined. (This is roughly the memory I can
> recollect so far.)
>
> So you see my focus back then was to devise a "failfast" option whereas RFC5482
> was meant for a "failslow" case. I think that explains why I let the
> option override
> keepalive so a TCP connection can "fail fast" while RFC5482 4.2 tries to prevent
> keepalive failure ahead of UTO, favoring "fail slow".
>
> If we start from a clean slate then perhaps one can argue the semantic
> either way
> but we do not have a clean slate. For that I still slightly favor not
> changing the code
> because the risk of breakage is definitely non-zero and the issue you're having
> seem to be only related to documentation.

One more thing - the proposed patch compares TCP_KEEPIDLE against
TCP_USER_TIMEOUT. But I don't think TCP_KEEPIDLE is what the
"keep-alive
timer" referred to in RFC5482. Linux keepalive implementation seems to use #
of retries (TCP_KEEPCNT) rather than time duration (keep-alive time) to
determine when to quit. If that is the case then your proposed change is not
fully "compliant" either and the best is probably just don't change.

>
> Jerry
>
>> implementing and easy to fix with little or no risk of breakage.
>>
>> The proposed patch changes a feature that no one uses. It also imposes the
>> relation ship between keepalive and timeout values that is required by the
>> RFC and make sense.
>>
>> You are the final authority, if you say we should just fix the documentation
>> than that is fine.
>>
>> Shoaib
>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10  4:59             ` Jerry Chu
@ 2017-08-10 21:05               ` Rao Shoaib
  2017-08-11  2:32                 ` Jerry Chu
  0 siblings, 1 reply; 13+ messages in thread
From: Rao Shoaib @ 2017-08-10 21:05 UTC (permalink / raw)
  To: Jerry Chu
  Cc: David Miller, codesoldier1, Yuchung Cheng, Alexey Kuznetsov, netdev



On 08/09/2017 09:59 PM, Jerry Chu wrote:
> On Wed, Aug 9, 2017 at 8:32 PM, Jerry Chu <hkchu@google.com> wrote:
>> On Wed, Aug 9, 2017 at 5:47 PM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>>>
>>> On 08/09/2017 05:30 PM, David Miller wrote:
>>>> From: Joe Smith <codesoldier1@gmail.com>
>>>> Date: Wed, 9 Aug 2017 17:20:32 -0700
>>>>
>>>>> Making Linux conform to standards and behavior that is logical seems
>>>>> like a good enough reason.
>>>> That's an awesome attitude to have when we're implementing something
>>>> new and don't have the facility already.
>>>>
>>>> But when we have something already the only important consideration is
>>>> not breaking existing apps which rely on that behavior.
>>>>
>>>> That is much, much, more important than standards compliance.
>>>>
>>>> If users are confused, just fix the documentation.
>>> David,
>>>
>>> If it was just confusion than sure fixing the documentation is fine. What if
>>> the logic is incorrect, does not conform to the standard that is says it is
>> Not sure what part of logic is "incorrect" when it was a homegrown Linux API
>> with no need to conform with any "standard"? Note that the new API was invented
>> 7 years ago not out of need for RFC5482. In fact I initially call the option
>> TCP_FAILFAST and did not even know the existence of RFC5482 until someone
>> around the same time proposed a UTO option specifically for RFC5482 and I
>> thought the two can be combined. (This is roughly the memory I can
>> recollect so far.)
>>
>> So you see my focus back then was to devise a "failfast" option whereas RFC5482
>> was meant for a "failslow" case. I think that explains why I let the
>> option override
>> keepalive so a TCP connection can "fail fast" while RFC5482 4.2 tries to prevent
>> keepalive failure ahead of UTO, favoring "fail slow".
>>
>> If we start from a clean slate then perhaps one can argue the semantic
>> either way
>> but we do not have a clean slate. For that I still slightly favor not
>> changing the code
>> because the risk of breakage is definitely non-zero and the issue you're having
>> seem to be only related to documentation.
We all make mistakes and over look things, that seems to be the case 
here. If this was so important than I am sure there was a use case. None 
has been presented. Without a use case I do not understand why we have 
to live with broken logic when we have a chance to fix it and make the 
code better.

If this change does break something (very very unlikely) we will 
understand the use case and provide an appropriate solution.

> One more thing - the proposed patch compares TCP_KEEPIDLE against
> TCP_USER_TIMEOUT. But I don't think TCP_KEEPIDLE is what the
> "keep-alive
> timer" referred to in RFC5482. Linux keepalive implementation seems to use #
> of retries (TCP_KEEPCNT) rather than time duration (keep-alive time) to
> determine when to quit. If that is the case then your proposed change is not
> fully "compliant" either and the best is probably just don't change.
Did you look at the patch and what it changes ?

Take a look at the TCP_KEEPIDLE socket option and see what it does or 
just look at the man page of tcp(7)

TCP_KEEPIDLE (since Linux 2.4) The time (in seconds) the connection 
needs to remain idle before TCP starts sending keepalive probes, if the 
socket option SO_KEEPALIVE has been set on this socket. This option 
should not be used in code intended to be portable.

Shoaib.
>
>> Jerry
>>
>>> implementing and easy to fix with little or no risk of breakage.
>>>
>>> The proposed patch changes a feature that no one uses. It also imposes the
>>> relation ship between keepalive and timeout values that is required by the
>>> RFC and make sense.
>>>
>>> You are the final authority, if you say we should just fix the documentation
>>> than that is fine.
>>>
>>> Shoaib
>>>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482
  2017-08-10 21:05               ` Rao Shoaib
@ 2017-08-11  2:32                 ` Jerry Chu
  0 siblings, 0 replies; 13+ messages in thread
From: Jerry Chu @ 2017-08-11  2:32 UTC (permalink / raw)
  To: Rao Shoaib
  Cc: David Miller, codesoldier1, Yuchung Cheng, Alexey Kuznetsov, netdev

On Thu, Aug 10, 2017 at 2:05 PM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>
>
> On 08/09/2017 09:59 PM, Jerry Chu wrote:
>>
>> On Wed, Aug 9, 2017 at 8:32 PM, Jerry Chu <hkchu@google.com> wrote:
>>>
>>> On Wed, Aug 9, 2017 at 5:47 PM, Rao Shoaib <rao.shoaib@oracle.com> wrote:
>>>>
>>>>
>>>> On 08/09/2017 05:30 PM, David Miller wrote:
>>>>>
>>>>> From: Joe Smith <codesoldier1@gmail.com>
>>>>> Date: Wed, 9 Aug 2017 17:20:32 -0700
>>>>>
>>>>>> Making Linux conform to standards and behavior that is logical seems
>>>>>> like a good enough reason.
>>>>>
>>>>> That's an awesome attitude to have when we're implementing something
>>>>> new and don't have the facility already.
>>>>>
>>>>> But when we have something already the only important consideration is
>>>>> not breaking existing apps which rely on that behavior.
>>>>>
>>>>> That is much, much, more important than standards compliance.
>>>>>
>>>>> If users are confused, just fix the documentation.
>>>>
>>>> David,
>>>>
>>>> If it was just confusion than sure fixing the documentation is fine.
>>>> What if
>>>> the logic is incorrect, does not conform to the standard that is says it
>>>> is
>>>
>>> Not sure what part of logic is "incorrect" when it was a homegrown Linux
>>> API
>>> with no need to conform with any "standard"? Note that the new API was
>>> invented
>>> 7 years ago not out of need for RFC5482. In fact I initially call the
>>> option
>>> TCP_FAILFAST and did not even know the existence of RFC5482 until someone
>>> around the same time proposed a UTO option specifically for RFC5482 and I
>>> thought the two can be combined. (This is roughly the memory I can
>>> recollect so far.)
>>>
>>> So you see my focus back then was to devise a "failfast" option whereas
>>> RFC5482
>>> was meant for a "failslow" case. I think that explains why I let the
>>> option override
>>> keepalive so a TCP connection can "fail fast" while RFC5482 4.2 tries to
>>> prevent
>>> keepalive failure ahead of UTO, favoring "fail slow".
>>>
>>> If we start from a clean slate then perhaps one can argue the semantic
>>> either way
>>> but we do not have a clean slate. For that I still slightly favor not
>>> changing the code
>>> because the risk of breakage is definitely non-zero and the issue you're
>>> having
>>> seem to be only related to documentation.
>
> We all make mistakes and over look things, that seems to be the case here.

First of all, I do not understand what "mistake" has been committed? Like I
said, the API was not invented for RFC5482 specifically. It was really meant for
a "fail fast" case that was requested of me fir our internal use case
for a product
that back then had only 100 million users (but has mostly like grown to many
billions now).

> If this was so important than I am sure there was a use case. None has been
> presented. Without a use case I do not understand why we have to live with

I don't understand this one either. If you want a use case, you can
ask politely,
no need to flame or fume. Also why would I waste my energy on something
no one would use, then even more energy arguing now for maintaining its
semantic?

This API was requested of me by our Android group and has since been
deployed on all our Android servers. I initially pushed back and asked them to
just program a short KEEPALIVE time for all the Android connections but they
can't use it because it would wake up clients too often, unnecessarily
waste a lot
of battery power.

Its uses inside Google have proliferated since. Most recently (a few months ago)
I got an email from our Nest division saying they hit a bug related to
TCP_USER_TIMEOUT and even pointed me to
https://bugzilla.redhat.com/show_bug.cgi?id=1189241 (but i haven't got the time
to look into it and in the meantime they found a workaround)

So what's your use case then that really requires changing the implementation?

> broken logic when we have a chance to fix it and make the code better.
>
> If this change does break something (very very unlikely) we will understand
> the use case and provide an appropriate solution.
>
>> One more thing - the proposed patch compares TCP_KEEPIDLE against
>> TCP_USER_TIMEOUT. But I don't think TCP_KEEPIDLE is what the
>> "keep-alive
>> timer" referred to in RFC5482. Linux keepalive implementation seems to use
>> #
>> of retries (TCP_KEEPCNT) rather than time duration (keep-alive time) to
>> determine when to quit. If that is the case then your proposed change is
>> not
>> fully "compliant" either and the best is probably just don't change.
>
> Did you look at the patch and what it changes ?
>
> Take a look at the TCP_KEEPIDLE socket option and see what it does or just
> look at the man page of tcp(7)
>
> TCP_KEEPIDLE (since Linux 2.4) The time (in seconds) the connection needs to
> remain idle before TCP starts sending keepalive probes, if the socket option
> SO_KEEPALIVE has been set on this socket. This option should not be used in
> code intended to be portable.

Yes precisely my point above - the intent of section 4.2 in RFC5482, if i guess
correctly, is to avoid the case in some implementations where keep-alive code
uses a different timer than the regular retrans timer where TCP_USER_TIMEOUT
logic is allied, hence not subject to TCP_USER_TIMEOUT. E.g., if the
app programs
TCP_USER_TIMEOUT to be one hour, but a keepalive timeout to be 10 minutes,
RFC5482 does not want a connection to be aborted after 10 minutes due to
keepalive probe failure. This can be readily translated in some OSes if the
keepalive API is time/duration based. E.g., one can use the same
TCP_USER_TIMEOUT value to program the keepalive API.

But Linux's keepalive API has a TCP_KEEPIDLE period plus TCP_KEEPCNT of
retries plus TCP_KEEPINTVL. So depending on TCP_KEEPCNT and
TCP_KEEPINTVL, the total duration for keepalive may be >> TCP_KEEPIDLE
hence my comment above.

Again my main concern is backward compatibility. Our existing use cases most
likely have TCP_USER_TIMEOUT value < TCP_KEEPIDLE value so the risk
is mainly in removing the icsk_user_timeout check inside tcp_keepalive_timer().

Jerry

>
> Shoaib.
>
>>
>>> Jerry
>>>
>>>> implementing and easy to fix with little or no risk of breakage.
>>>>
>>>> The proposed patch changes a feature that no one uses. It also imposes
>>>> the
>>>> relation ship between keepalive and timeout values that is required by
>>>> the
>>>> RFC and make sense.
>>>>
>>>> You are the final authority, if you say we should just fix the
>>>> documentation
>>>> than that is fine.
>>>>
>>>> Shoaib
>>>>
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-08-18 16:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-07 18:16 [PATCH v1 net] TCP_USER_TIMEOUT and tcp_keepalive should conform to RFC5482 Rao Shoaib
2017-08-08  9:59 ` Eric Dumazet
2017-08-08 17:25 ` Yuchung Cheng
2017-08-09 23:52   ` Jerry Chu
2017-08-10  0:20     ` Joe Smith
2017-08-10  0:30       ` David Miller
2017-08-10  0:47         ` Rao Shoaib
2017-08-10  0:52           ` David Miller
2017-08-10  3:32           ` Jerry Chu
2017-08-10  4:59             ` Jerry Chu
2017-08-10 21:05               ` Rao Shoaib
2017-08-11  2:32                 ` Jerry Chu
2017-08-10  0:31       ` Rao Shoaib

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.