From: Eric Dumazet
Date: Wed, 24 Apr 2019 08:47:27 -0700
Subject: Re: [PATCH net-next 2/3] tcp: implement coalescing on backlog queue
In-Reply-To: <20190424165150.1420b046@pluto.restena.lu>
References: <85aabf9d4f41b6c57629e736993233f80a037e59.camel@linuxfoundation.org>
 <20190424165150.1420b046@pluto.restena.lu>
To: Bruno Prémont
Cc: richard.purdie@linuxfoundation.org, Neal Cardwell, Yuchung Cheng,
Miller" , netdev , Alexander Kanavin , Bruce Ashfield Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Wed, Apr 24, 2019 at 7:51 AM Bruno Pr=C3=A9mont wro= te: > > Hi Eric, > > I'm seeing issues with this patch as well, not as regular as for > Richard but still (about up to one in 30-50 TCP sessions). > > In my case I have a virtual machine (on VMWare) with this patch where > NGINX as reverse proxy misses part (end) of payload from its upstream > and times out on the upstream connection (while according to tcpdump all > packets including upstream's FIN were sent and the upstream did get > ACKs from the VM). > > From when browsers get from NGINX it feels as if at some point reading > from the socket or waiting for data using select() never returned data > that arrived as more than just EOF is missing. > > The upstream is a hardware machine in the same subnet. > > My VM is using VMware VMXNET3 Ethernet Controller [15ad:07b0] (rev 01) > as network adapter which lists the following features: > Hi Bruno. I suspect a EPOLLIN notification being lost by the application. Fact that TCP backlog contains 1 instead of 2+ packets should not change stack behavior, this packet should land into socket receive queue eventually. Are you using epoll() in Edge Trigger mode. You mention select() but select() is a rather old and inefficient API. Could you watch/report the output of " ss -temoi " for the frozen TCP flow= ? This migtht give us a clue about packets being dropped, say the the accumulated packet became too big. > rx-checksumming: on > tx-checksumming: on > tx-checksum-ipv4: off [fixed] > tx-checksum-ip-generic: on > tx-checksum-ipv6: off [fixed] > tx-checksum-fcoe-crc: off [fixed] > tx-checksum-sctp: off [fixed] > scatter-gather: on > tx-scatter-gather: on > tx-scatter-gather-fraglist: off [fixed] > tcp-segmentation-offload: on > tx-tcp-segmentation: on > tx-tcp-ecn-segmentation: off [fixed] > tx-tcp-mangleid-segmentation: off > tx-tcp6-segmentation: on > udp-fragmentation-offload: off > generic-segmentation-offload: on > generic-receive-offload: on > large-receive-offload: on > rx-vlan-offload: on > tx-vlan-offload: on > ntuple-filters: off [fixed] > receive-hashing: off [fixed] > highdma: on > rx-vlan-filter: on [fixed] > vlan-challenged: off [fixed] > tx-lockless: off [fixed] > netns-local: off [fixed] > tx-gso-robust: off [fixed] > tx-fcoe-segmentation: off [fixed] > tx-gre-segmentation: off [fixed] > tx-gre-csum-segmentation: off [fixed] > tx-ipxip4-segmentation: off [fixed] > tx-ipxip6-segmentation: off [fixed] > tx-udp_tnl-segmentation: off [fixed] > tx-udp_tnl-csum-segmentation: off [fixed] > tx-gso-partial: off [fixed] > tx-sctp-segmentation: off [fixed] > tx-esp-segmentation: off [fixed] > tx-udp-segmentation: off [fixed] > fcoe-mtu: off [fixed] > tx-nocache-copy: off > loopback: off [fixed] > rx-fcs: off [fixed] > rx-all: off [fixed] > tx-vlan-stag-hw-insert: off [fixed] > rx-vlan-stag-hw-parse: off [fixed] > rx-vlan-stag-filter: off [fixed] > l2-fwd-offload: off [fixed] > hw-tc-offload: off [fixed] > esp-hw-offload: off [fixed] > esp-tx-csum-hw-offload: off [fixed] > rx-udp_tunnel-port-offload: off [fixed] > tls-hw-tx-offload: off [fixed] > tls-hw-rx-offload: off [fixed] > rx-gro-hw: off [fixed] > tls-hw-record: off [fixed] > > > I can reproduce the issue with kernels 5.0.x and as recent as 5.1-rc6. 
>
> Cheers,
> Bruno
>
> On Sunday, April 7, 2019 11:28:30 PM CEST, richard.purdie@linuxfoundation.org wrote:
> > Hi,
> >
> > I've been chasing down why a Python test from the python3 test suite
> > started failing, and it seems to point to this kernel change in the
> > networking stack.
> >
> > In kernels beyond commit 4f693b55c3d2d2239b8a0094b518a1e533cf75d5 the
> > test hangs about 90% of the time (I've reproduced with 5.1-rc3, 5.0.7,
> > 5.0-rc1 but not 4.18, 4.19 or 4.20). The reproducer is:
> >
> > $ python3 -m test test_httplib -v
> > == CPython 3.7.2 (default, Apr  5 2019, 15:17:15) [GCC 8.3.0]
> > == Linux-5.0.0-yocto-standard-x86_64-with-glibc2.2.5 little-endian
> > == cwd: /var/volatile/tmp/test_python_288
> > == CPU count: 1
> > == encodings: locale=UTF-8, FS=utf-8
> > [...]
> > test_response_fileno (test.test_httplib.BasicTest) ...
> >
> > and it hangs in test_response_fileno.
> >
> > The test in question comes from Lib/test/test_httplib.py in the Python
> > source tree and the code is:
> >
> >     def test_response_fileno(self):
> >         # Make sure fd returned by fileno is valid.
> >         serv = socket.socket(
> >             socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)
> >         self.addCleanup(serv.close)
> >         serv.bind((HOST, 0))
> >         serv.listen()
> >
> >         result = None
> >         def run_server():
> >             [conn, address] = serv.accept()
> >             with conn, conn.makefile("rb") as reader:
> >                 # Read the request header until a blank line
> >                 while True:
> >                     line = reader.readline()
> >                     if not line.rstrip(b"\r\n"):
> >                         break
> >                 conn.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
> >
> >                 nonlocal result
> >                 result = reader.read()
> >
> >         thread = threading.Thread(target=run_server)
> >         thread.start()
> >         self.addCleanup(thread.join, float(1))
> >         conn = client.HTTPConnection(*serv.getsockname())
> >         conn.request("CONNECT", "dummy:1234")
> >         response = conn.getresponse()
> >         try:
> >             self.assertEqual(response.status, client.OK)
> >             s = socket.socket(fileno=response.fileno())
> >             try:
> >                 s.sendall(b"proxied data\n")
> >             finally:
> >                 s.detach()
> >         finally:
> >             response.close()
> >             conn.close()
> >         thread.join()
> >         self.assertEqual(result, b"proxied data\n")
> >
> > I was hoping someone with more understanding of the networking stack
> > could look at this and tell whether it's a bug in the python test, the
> > kernel change, or otherwise give a pointer to where the problem might
> > be? I'll freely admit this is not an area I know much about.
> >
> > Cheers,
> >
> > Richard
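
For reference, a minimal sketch of the edge-trigger drain pattern
mentioned above. The upstream host, port and timeout below are made-up
placeholders, not taken from Bruno's setup; the point is only that with
EPOLLET a reader must keep calling recv() until it hits EAGAIN,
otherwise payload that arrived behind a single (possibly coalesced)
wakeup, and eventually the FIN, stays in the receive queue and the
connection looks hung:

    import select
    import socket

    # Placeholder peer; not Bruno's real upstream.
    sock = socket.create_connection(("upstream.example", 8080))
    sock.setblocking(False)

    ep = select.epoll()
    ep.register(sock.fileno(), select.EPOLLIN | select.EPOLLET)

    received = bytearray()
    eof = False
    while not eof:
        events = ep.poll(5)        # 5 s timeout, like a proxy upstream timeout
        if not events:
            break                  # no wakeup: with ET a missed event stalls here
        for fd, mask in events:
            if not mask & select.EPOLLIN:
                continue
            # With EPOLLET the socket must be drained until EAGAIN; a single
            # recv() per wakeup can leave the tail of the payload and the FIN
            # unread, with no further notification to come.
            while True:
                try:
                    chunk = sock.recv(4096)
                except BlockingIOError:
                    break          # drained for now, wait for the next edge
                if not chunk:
                    eof = True     # FIN consumed, all payload was read
                    break
                received.extend(chunk)

    ep.close()
    sock.close()

If the application really is using plain select() (level-triggered), a
lost wakeup is much less likely, and the "ss -temoi" output for a stuck
flow becomes the more telling next step.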