References: <20210109043808.GA3694@localhost.localdomain>
In-Reply-To: <20210109043808.GA3694@localhost.localdomain>
From: Neal Cardwell
Date: Mon, 11 Jan 2021 09:58:33 -0500
Subject: Re: [PATCH] Revert "tcp: simplify window probe aborting on USER_TIMEOUT"
To: Enke Chen
Cc: Eric Dumazet, "David S. Miller", Alexey Kuznetsov, Hideaki YOSHIFUJI,
    Jakub Kicinski, Soheil Hassas Yeganeh, Yuchung Cheng, Netdev, LKML,
    Jonathan Maxwell, William McCall, enchen2020@gmail.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 8, 2021 at 11:38 PM Enke Chen wrote:
>
> From: Enke Chen
>
> This reverts commit 9721e709fa68ef9b860c322b474cfbd1f8285b0f.
>
> With the commit 9721e709fa68 ("tcp: simplify window probe aborting
> on USER_TIMEOUT"), the TCP session does not terminate with
> TCP_USER_TIMEOUT when data remain untransmitted due to zero window.
>
> The number of unanswered zero-window probes (tcp_probes_out) is
> reset to zero with incoming acks irrespective of the window size,
> as described in tcp_probe_timer():
>
>     RFC 1122 4.2.2.17 requires the sender to stay open indefinitely
>     as long as the receiver continues to respond probes. We support
>     this by default and reset icsk_probes_out with incoming ACKs.
>
> This counter, however, is the wrong one to be used in calculating the
> duration that the window remains closed and data remain untransmitted.
> Thanks to Jonathan Maxwell for diagnosing the
> actual issue.
>
> Cc: stable@vger.kernel.org
> Fixes: 9721e709fa68 ("tcp: simplify window probe aborting on USER_TIMEOUT")
> Reported-by: William McCall
> Signed-off-by: Enke Chen
> ---

I ran this revert commit through our packetdrill TCP tests, and it's
causing failures in a ZWP/USER_TIMEOUT test due to interactions with
this Jan 2019 patch:

    7f12422c4873e9b274bc151ea59cb0cdf9415cf1
    tcp: always timestamp on every skb transmission

The issue seems to be that after 7f12422c4873 the skb->skb_mstamp_ns
is set on every transmit attempt. That means that even skbs that are
not successfully transmitted have a non-zero skb_mstamp_ns. That means
that if ZWPs are repeatedly failing to be sent due to severe local
qdisc congestion, then at this point in the code the start_ts is
always only 500ms in the past (from TCP_RESOURCE_PROBE_INTERVAL =
500ms). That means that if there is severe local qdisc congestion a
USER_TIMEOUT above 500ms is a NOP, and the socket can live far past
the USER_TIMEOUT.

It seems we need a slightly different approach than the revert in this commit.

neal
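
The failure mode described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the function names, the `stamp_on_failed_attempt` flag, the `>=` comparison, and the 10-second horizon are all assumptions made for illustration. Only the 500 ms probe interval and the idea of measuring elapsed time from the skb timestamp come from the message.

```python
# Toy model of the probe-timer check (hypothetical names, not kernel code).
PROBE_INTERVAL_MS = 500  # TCP_RESOURCE_PROBE_INTERVAL, per the message above


def simulate(user_timeout_ms, stamp_on_failed_attempt, horizon_ms=10_000):
    """Return the time (ms) at which the socket is aborted, or None if it
    is still alive when the simulated horizon is reached.

    Every probe transmit attempt is assumed to fail locally, modelling
    severe qdisc congestion.
    """
    start_ts = 0  # timestamp recorded on the head skb at the first attempt
    now = 0
    while now < horizon_ms:
        now += PROBE_INTERVAL_MS  # the probe timer fires again
        # USER_TIMEOUT check: time elapsed since the recorded timestamp.
        if now - start_ts >= user_timeout_ms:
            return now
        # The probe is attempted and fails to leave the host.
        if stamp_on_failed_attempt:
            # Models the post-7f12422c4873 behaviour described above:
            # the timestamp is refreshed even though nothing was sent.
            start_ts = now
    return None


if __name__ == "__main__":
    print(simulate(2000, stamp_on_failed_attempt=False))
    print(simulate(2000, stamp_on_failed_attempt=True))
```

In this model, when the timestamp is refreshed on every failed attempt the elapsed time never exceeds one probe interval, so any timeout larger than 500 ms never fires within the horizon. When the timestamp is left at the first attempt, the socket is aborted once the timeout has elapsed.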