From: David Laight
To: 'Jakub Sitnicki', Alexei Starovoitov
CC: bpf, Network Development, kernel-team, Alexei Starovoitov,
 Daniel Borkmann, "David S. Miller", Jakub Kicinski, Andrii Nakryiko,
 Lorenz Bauer, Marek Majkowski, Martin KaFai Lau, Yonghong Song
Subject: RE: BPF sk_lookup v5 - TCP SYN and UDP 0-len flood benchmarks
Date: Thu, 20 Aug 2020 12:20:25 +0000
References: <20200717103536.397595-1-jakub@cloudflare.com>
 <87lficrm2v.fsf@cloudflare.com> <87k0xtsj91.fsf@cloudflare.com>
In-Reply-To: <87k0xtsj91.fsf@cloudflare.com>
X-Mailing-List: bpf@vger.kernel.org

From: Jakub Sitnicki
> Sent: 20 August 2020 11:30
> Subject: Re: BPF sk_lookup v5 - TCP SYN and UDP 0-len flood benchmarks
>
> On Tue, Aug 18, 2020 at 08:19 PM CEST, Alexei Starovoitov wrote:
> > On Tue, Aug 18, 2020 at 8:49 AM Jakub Sitnicki wrote:
> >>          :                      rcu_read_lock();
> >>          :                      run_array = rcu_dereference(net->bpf.run_array[NETNS_BPF_SK_LOOKUP]);
> >>     0.01 :   ffffffff817f8624:       mov    0xd68(%r12),%rsi
> >>          :                      if (run_array) {
> >>     0.00 :   ffffffff817f862c:       test   %rsi,%rsi
> >>     0.00 :   ffffffff817f862f:       je     ffffffff817f87a9 <__udp4_lib_lookup+0x2c9>
> >>          :                      struct bpf_sk_lookup_kern ctx = {
> >>     1.05 :   ffffffff817f8635:       xor    %eax,%eax
> >>     0.00 :   ffffffff817f8637:       mov    $0x6,%ecx
> >>     0.01 :   ffffffff817f863c:       movl   $0x110002,0x40(%rsp)
> >>     0.00 :   ffffffff817f8644:       lea    0x48(%rsp),%rdi
> >>    18.76 :   ffffffff817f8649:       rep stos %rax,%es:(%rdi)
> >>     1.12 :   ffffffff817f864c:       mov    0xc(%rsp),%eax
> >>     0.00 :   ffffffff817f8650:       mov    %ebp,0x48(%rsp)
> >>     0.00 :   ffffffff817f8654:       mov    %eax,0x44(%rsp)
> >>     0.00 :   ffffffff817f8658:       movzwl 0x10(%rsp),%eax
> >>     1.21 :   ffffffff817f865d:       mov    %ax,0x60(%rsp)
> >>     0.00 :   ffffffff817f8662:       movzwl 0x20(%rsp),%eax
> >>     0.00 :   ffffffff817f8667:       mov    %ax,0x62(%rsp)
> >>          :                      .sport = sport,
> >>          :                      .dport = dport,
> >>          :                      };
> >
> > Such heavy hit to zero init 56-byte structure is surprising.
> > There are two 4-byte holes in this struct. You can try to pack it and
> > make sure that 'rep stoq' is used instead of 'rep stos' (8 byte at a time vs 4).
>
> Thanks for the tip. I'll give it a try.

You probably don't want to use 'rep stos' in any of its forms.
The instruction 'setup' time is horrid on most cpu variants.
For a 48 byte structure six writes of a zero register will be faster.
If gcc is generating the 'rep stos' then the compiler source code
for that pessimisation needs deleting...

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
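
For anyone wanting to compare the two zeroing strategies outside the kernel, here is a minimal userspace sketch. It is only an illustration: struct ctx48 is a hypothetical 48-byte stand-in for the region that the 'rep stos' above clears, not the real struct bpf_sk_lookup_kern, and the helper names are made up. Which instructions the compiler actually emits for either variant depends on the compiler version, optimisation level and tuning flags, so the generated assembly has to be inspected rather than assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical 48-byte region standing in for the part of the lookup
 * context that gets zeroed; NOT the kernel's struct bpf_sk_lookup_kern.
 */
struct ctx48 {
	uint64_t w[6];
};

_Static_assert(sizeof(struct ctx48) == 48, "expected six 8-byte words");

/* Variant 1: six explicit 64-bit stores of zero. */
static void zero_six_stores(struct ctx48 *c)
{
	c->w[0] = 0;
	c->w[1] = 0;
	c->w[2] = 0;
	c->w[3] = 0;
	c->w[4] = 0;
	c->w[5] = 0;
}

/* Variant 2: fixed-size memset; the compiler chooses how to expand it. */
static void zero_memset(struct ctx48 *c)
{
	memset(c, 0, sizeof(*c));
}

/* Return 1 if every word is zero, 0 otherwise. */
static int is_all_zero(const struct ctx48 *c)
{
	for (size_t i = 0; i < 6; i++) {
		if (c->w[i] != 0)
			return 0;
	}
	return 1;
}

/* Poison two buffers, zero one each way, check both end up all zero. */
static void check_both_ways(void)
{
	struct ctx48 a, b;

	memset(&a, 0xff, sizeof(a));
	memset(&b, 0xff, sizeof(b));
	zero_six_stores(&a);
	zero_memset(&b);
	assert(is_all_zero(&a));
	assert(is_all_zero(&b));
	assert(memcmp(&a, &b, sizeof(a)) == 0);
}
```

Both variants produce the same result; the difference, if any, is in the generated code. Compiling with 'gcc -O2 -S' and reading the output shows what each function turns into on a given toolchain. GCC's x86 back end also documents a -mstringop-strategy= option that influences how inline memset/memcpy are expanded, which may be worth experimenting with before changing the source.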