From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D828C4332D for ; Thu, 4 Mar 2021 00:17:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F28E064EEE for ; Thu, 4 Mar 2021 00:17:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1450129AbhCDAPw (ORCPT ); Wed, 3 Mar 2021 19:15:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41650 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1389334AbhCCVhN (ORCPT ); Wed, 3 Mar 2021 16:37:13 -0500 Received: from mail-io1-xd2b.google.com (mail-io1-xd2b.google.com [IPv6:2607:f8b0:4864:20::d2b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86D3EC061756 for ; Wed, 3 Mar 2021 13:36:05 -0800 (PST) Received: by mail-io1-xd2b.google.com with SMTP id k2so21989627ioh.5 for ; Wed, 03 Mar 2021 13:36:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=nGmVks42vS+S7OW12Ssfu9opeTt4aGtiJwwc4G2Hx0s=; b=OKzoUJKEBpxj9wOVzNvRL0M9qmFGhCRmq6irrLysUcyetexbd8aLos3eNZv6xq4T68 b41g12znP6CFCDULReZP5h0+cm/nso8B4OXm0wWhMsCp96OS0qWuSzzGS3YFdsdibeWi FdhSPWAf4NJuzQfXi/RsoYLtrHv+yRiIxuUf2heSJEod1JZ2EMYoGclBV2b0WTPEXxdW ZHFmpfgCK0jMFjoNkjDGsY2BZ6Dm2ueQ8ViWbw6jqitROoCSWIz3F2ps6eOcrVA55R99 6G2cC/CPRczRMsbdRyaKqxEBYoWTtlc/36Ls/F+jMg5DpQHXP4zz4/tRN80efdgse3tg IIlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=nGmVks42vS+S7OW12Ssfu9opeTt4aGtiJwwc4G2Hx0s=; b=nbN7ZeE+oS+SZuq9KpzEQZyn1BsO0ewEhS31+3fz/ixzurqBxxA4MrXej1ISCqtt7u rJmipm5AKA2p9o0ViHYKJzhVM/AJwro7CKZUGp2jfvBS2J39Ge3mI2lD8zAMrtzhDhCj ZtLvTQ0krvOeyQhdyNvkr7s2ulVSh+tWagxMXTxDhExmIUH86uPMXmdWK60cn0TZtreZ 4SZW2D2HM6NbctN16U0ttDTg5bfrQUMhcjf4jzDl0zkAs8o8vkytrQVWS/MzYoAIotPd kfnQzyWapv51brwV1eP3aA/AzsOc48cN1FNd/tspvzJNMq6NekveGQkIVozdWUFEf92/ KZbg== X-Gm-Message-State: AOAM533YxYC5QJLrgrw8OiBqjzhAXZnPoBwNilKC9MUghgsg1XW6+Mvo aSjmRiMDl/mjuuNqHfI5zOklJ6VuRboNIyQqNJY= X-Google-Smtp-Source: ABdhPJze11NFGRPF92kbYowLGzP75qCjjAhkAsHhnNTljyfvxcMYDTj2SCWRnytlnJdQEdpasx/YnuL7f/cIOancEL4= X-Received: by 2002:a5d:81c8:: with SMTP id t8mr1115639iol.38.1614807364687; Wed, 03 Mar 2021 13:36:04 -0800 (PST) MIME-Version: 1.0 References: <20210302060753.953931-1-kuba@kernel.org> In-Reply-To: From: Alexander Duyck Date: Wed, 3 Mar 2021 13:35:53 -0800 Message-ID: Subject: Re: [PATCH net] net: tcp: don't allocate fast clones for fastopen SYN To: Eric Dumazet Cc: Jakub Kicinski , David Miller , netdev , kernel-team , Neil Spring Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Tue, Mar 2, 2021 at 1:37 PM Eric Dumazet wrote: > > On Tue, Mar 2, 2021 at 7:08 AM Jakub Kicinski wrote: > > > > When receiver does not accept TCP Fast Open it will only ack > > the SYN, and not the data. We detect this and immediately queue > > the data for (re)transmission in tcp_rcv_fastopen_synack(). > > > > In DC networks with very low RTT and without RFS the SYN-ACK > > may arrive before NIC driver reported Tx completion on > > the original SYN. In which case skb_still_in_host_queue() > > returns true and sender will need to wait for the retransmission > > timer to fire milliseconds later. > > > > Revert back to non-fast clone skbs, this way > > skb_still_in_host_queue() won't prevent the recovery flow > > from completing. > > > > Suggested-by: Eric Dumazet > > Fixes: 355a901e6cf1 ("tcp: make connect() mem charging friendly") > > Hmmm, not sure if this Fixes: tag makes sense. > > Really, if we delay TX completions by say 10 ms, other parts of the > stack will misbehave anyway. > > Also, backporting this patch up to linux-3.19 is going to be tricky. > > The real issue here is that skb_still_in_host_queue() can give a false positive. > > I have mixed feelings here, as you can read my answer :/ > > Maybe skb_still_in_host_queue() signal should not be used when a part > of the SKB has been received/acknowledged by the remote peer > (in this case the SYN part). > > Alternative is that drivers unable to TX complete their skbs in a > reasonable time should call skb_orphan() > to avoid skb_unclone() penalties (and this skb_still_in_host_queue() issue) > > If you really want to play and delay TX completions, maybe provide a > way to disable skb_still_in_host_queue() globally, > using a static key ? The problem as I see it is that the original fclone isn't what we sent out on the wire and that is confusing things. What we sent was a SYN with data, but what we have now is just a data frame that hasn't been put out on the wire yet. I wonder if we couldn't get away with doing something like adding a fourth option of SKB_FCLONE_MODIFIED that we could apply to fastopen skbs? That would keep the skb_still_in_host queue from triggering as we would be changing the state from SKB_FCLONE_ORIG to SKB_FCLONE_MODIFIED for the skb we store in the retransmit queue. In addition if we have to clone it again and the fclone reference count is 1 we could reset it back to SKB_FCLONE_ORIG.