linux-kernel.vger.kernel.org archive mirror
* [PATCH net-next v2] tcp: fix condition for increasing pingpong count
@ 2022-07-20  7:24 LemmyHuang
  2022-07-20 18:49 ` Neal Cardwell
  0 siblings, 1 reply; 3+ messages in thread
From: LemmyHuang @ 2022-07-20  7:24 UTC (permalink / raw)
  To: edumazet, davem, dsahern, kuba, pabeni; +Cc: netdev, linux-kernel, LemmyHuang

When CONFIG_HZ is 1000 (one jiffy per millisecond) and the network
transmission time is less than 1 ms, lsndtime and lrcvtime are likely to
be equal. before() then returns false, the pingpong count is not
increased, and it can take hundreds of interactions before the
connection enters pingpong mode.

Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: LemmyHuang <hlm3280@163.com>
---
v2:
  * Use !after() wrapping the values. (Jakub Kicinski)

v1: https://lore.kernel.org/netdev/20220719130136.11907-1-hlm3280@163.com/
---
 net/ipv4/tcp_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 858a15cc2..c1c95dc40 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -172,7 +172,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
 	 * and it is a reply for ato after last received packet,
 	 * increase pingpong count.
 	 */
-	if (before(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
+	if (!after(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
 	    (u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
 		inet_csk_inc_pingpong_cnt(sk);
 
-- 
2.27.0
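
For illustration only (not part of the patch): a minimal userspace sketch of
the timestamp half of the check, assuming every request and response of a
fast exchange lands in the same jiffy. before() and after() mirror the
wrap-safe helpers in include/net/tcp.h; count_same_jiffy_rounds() and
self_check() are hypothetical names, and the ato half of the condition is
assumed to hold throughout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace copies of the kernel's wrap-safe 32-bit comparisons. */
static bool before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

static bool after(uint32_t seq1, uint32_t seq2)
{
	return before(seq2, seq1);
}

/*
 * Hypothetical helper: count how many of `rounds` request/response
 * exchanges, all stamped with jiffy `j`, pass the timestamp test.
 * Assumption: lsndtime is updated only after the test is evaluated.
 */
static unsigned int count_same_jiffy_rounds(uint32_t j, unsigned int rounds,
					    bool use_not_after)
{
	uint32_t lsndtime = j - 10;	/* last send well before the first request */
	uint32_t lrcvtime;
	unsigned int cnt = 0;

	for (unsigned int i = 0; i < rounds; i++) {
		lrcvtime = j;		/* request arrives */
		if (use_not_after ? !after(lsndtime, lrcvtime)
				  : before(lsndtime, lrcvtime))
			cnt++;
		lsndtime = j;		/* response sent in the same jiffy */
	}
	return cnt;
}

/* Usage: call from main() to check both variants. */
static void self_check(void)
{
	/* before(): only the first round counts, later ones see equal stamps. */
	assert(count_same_jiffy_rounds(1000, 5, false) == 1);
	/* !after(): equal stamps pass, so every round counts. */
	assert(count_same_jiffy_rounds(1000, 5, true) == 5);
}
```

In this model the original condition counts only the first of five same-jiffy
rounds, while the !after() variant counts all five.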



* Re: [PATCH net-next v2] tcp: fix condition for increasing pingpong count
  2022-07-20  7:24 [PATCH net-next v2] tcp: fix condition for increasing pingpong count LemmyHuang
@ 2022-07-20 18:49 ` Neal Cardwell
  2022-07-21  1:47   ` LemmyHuang
  0 siblings, 1 reply; 3+ messages in thread
From: Neal Cardwell @ 2022-07-20 18:49 UTC (permalink / raw)
  To: LemmyHuang
  Cc: edumazet, davem, dsahern, kuba, pabeni, netdev, linux-kernel,
	Wei Wang, Yuchung Cheng, Soheil Hassas Yeganeh

On Wed, Jul 20, 2022 at 3:25 AM LemmyHuang <hlm3280@163.com> wrote:
>
> When CONFIG_HZ defaults to 1000Hz and the network transmission time is
> less than 1ms, lsndtime and lrcvtime are likely to be equal, which will
> lead to hundreds of interactions before entering pingpong mode.
>
> Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3")
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: LemmyHuang <hlm3280@163.com>
> ---
> v2:
>   * Use !after() wrapping the values. (Jakub Kicinski)
>
> v1: https://lore.kernel.org/netdev/20220719130136.11907-1-hlm3280@163.com/
> ---
>  net/ipv4/tcp_output.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 858a15cc2..c1c95dc40 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -172,7 +172,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
>          * and it is a reply for ato after last received packet,
>          * increase pingpong count.
>          */
> -       if (before(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
> +       if (!after(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
>             (u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
>                 inet_csk_inc_pingpong_cnt(sk);
>
> --

Thanks for pointing out this problem!

AFAICT this patch would result in incorrect behavior.

With this patch, we could have cases where tp->lsndtime ==
icsk->icsk_ack.lrcvtime and (u32)(now - icsk->icsk_ack.lrcvtime) <
icsk->icsk_ack.ato and yet we do not really have a ping-pong exchange.

For example, with this patch we could have:

T1: jiffies=J1; host B receives RPC request from host A
T2: jiffies=J1; host B sends first RPC response data packet to host A;
      -> calls inet_csk_inc_pingpong_cnt()
T3: jiffies=J1; host B sends second RPC response data packet to host A;
      -> calls inet_csk_inc_pingpong_cnt()

In this scenario there is only one ping-pong exchange but the code
calls inet_csk_inc_pingpong_cnt() twice.
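
The T1..T3 timeline above can be replayed with a small userspace model. This
is a sketch under stated assumptions, not kernel code: struct conn_model and
the model_*() and run_t1_t3() names are hypothetical, the jiffies value and
ato are arbitrary, and lsndtime is assumed to be updated after the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace copies of the kernel's wrap-safe 32-bit comparisons. */
static bool before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

static bool after(uint32_t seq1, uint32_t seq2)
{
	return before(seq2, seq1);
}

/* Hypothetical, simplified stand-in for the socket state used by the check. */
struct conn_model {
	uint32_t lsndtime;		/* jiffy of the last data sent */
	uint32_t lrcvtime;		/* jiffy of the last data received */
	uint32_t ato;			/* delayed-ACK timeout, in jiffies */
	unsigned int pingpong_cnt;
};

static void model_data_received(struct conn_model *c, uint32_t now)
{
	c->lrcvtime = now;
}

static void model_data_sent(struct conn_model *c, uint32_t now,
			    bool use_not_after)
{
	bool first_reply = use_not_after ? !after(c->lsndtime, c->lrcvtime)
					 : before(c->lsndtime, c->lrcvtime);

	if (first_reply && (uint32_t)(now - c->lrcvtime) < c->ato)
		c->pingpong_cnt++;
	c->lsndtime = now;		/* assumed: updated after the check */
}

/* Replay T1..T3, all within the same jiffy J1, and return the count. */
static unsigned int run_t1_t3(bool use_not_after)
{
	const uint32_t j1 = 1000;	/* arbitrary jiffies value */
	struct conn_model c = {
		.lsndtime = j1 - 10,	/* previous send was earlier */
		.lrcvtime = 0,
		.ato = 40,		/* arbitrary value for the sketch */
		.pingpong_cnt = 0,
	};

	model_data_received(&c, j1);		/* T1: request arrives */
	model_data_sent(&c, j1, use_not_after);	/* T2: first response packet */
	model_data_sent(&c, j1, use_not_after);	/* T3: second response packet */
	return c.pingpong_cnt;
}

/* Usage: call from main() to check both variants. */
static void self_check(void)
{
	assert(run_t1_t3(false) == 1);	/* before(): one exchange, one count */
	assert(run_t1_t3(true) == 2);	/* !after(): same exchange counted twice */
}
```
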

So I'm hoping we can come up with a better fix.

A simpler approach might be to simplify the model and go back to
having a single ping-pong interaction cause delayed ACKs to be enabled
on a connection endpoint. Our team has been seeing good results for a
while with the simpler approach. What do folks think?


neal


* Re: [PATCH net-next v2] tcp: fix condition for increasing pingpong count
  2022-07-20 18:49 ` Neal Cardwell
@ 2022-07-21  1:47   ` LemmyHuang
  0 siblings, 0 replies; 3+ messages in thread
From: LemmyHuang @ 2022-07-21  1:47 UTC (permalink / raw)
  To: ncardwell
  Cc: davem, dsahern, edumazet, hlm3280, kuba, linux-kernel, netdev,
	pabeni, soheil, weiwan, ycheng

At 2022-07-21 02:49:35, "Neal Cardwell" <ncardwell@google.com> wrote:
> On Wed, Jul 20, 2022 at 3:25 AM LemmyHuang <hlm3280@163.com> wrote:
>>
>> When CONFIG_HZ defaults to 1000Hz and the network transmission time is
>> less than 1ms, lsndtime and lrcvtime are likely to be equal, which will
>> lead to hundreds of interactions before entering pingpong mode.
>>
>> Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3")
>> Suggested-by: Jakub Kicinski <kuba@kernel.org>
>> Signed-off-by: LemmyHuang <hlm3280@163.com>
>> ---
>> v2:
>>   * Use !after() wrapping the values. (Jakub Kicinski)
>>
>> v1: https://lore.kernel.org/netdev/20220719130136.11907-1-hlm3280@163.com/
>> ---
>>  net/ipv4/tcp_output.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
>> index 858a15cc2..c1c95dc40 100644
>> --- a/net/ipv4/tcp_output.c
>> +++ b/net/ipv4/tcp_output.c
>> @@ -172,7 +172,7 @@ static void tcp_event_data_sent(struct tcp_sock *tp,
>>          * and it is a reply for ato after last received packet,
>>          * increase pingpong count.
>>          */
>> -       if (before(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
>> +       if (!after(tp->lsndtime, icsk->icsk_ack.lrcvtime) &&
>>             (u32)(now - icsk->icsk_ack.lrcvtime) < icsk->icsk_ack.ato)
>>                 inet_csk_inc_pingpong_cnt(sk);
>>
>> --
>
> Thanks for pointing out this problem!
> 
> AFAICT this patch would result in incorrect behavior.
> 
> With this patch, we could have cases where tp->lsndtime ==
> icsk->icsk_ack.lrcvtime and (u32)(now - icsk->icsk_ack.lrcvtime) <
> icsk->icsk_ack.ato and yet we do not really have a ping-pong exchange.
> 
> For example, with this patch we could have:
> 
> T1: jiffies=J1; host B receives RPC request from host A
> T2: jiffies=J1; host B sends first RPC response data packet to host A;
>       -> calls inet_csk_inc_pingpong_cnt()
> T3: jiffies=J1; host B sends second RPC response data packet to host A;
>       -> calls inet_csk_inc_pingpong_cnt()
> 
> In this scenario there is only one ping-pong exchange but the code
> calls inet_csk_inc_pingpong_cnt() twice.
> 
> So I'm hoping we can come up with a better fix.
> 
> A simpler approach might be to simplify the model and go back to
> having a single ping-pong interaction cause delayed ACKs to be enabled
> on a connection endpoint. Our team has been seeing good results for a
> while with the simpler approach. What do folks think?
> 
> 
> neal

Going back to the simpler model does seem better.

See this revert patch:
https://lore.kernel.org/netdev/20220720233156.295074-1-weiwan@google.com/


