From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 21 Jan 2019 19:59:36 +0100
From: Jesper Dangaard Brouer
To: bjorn.topel@gmail.com
Cc: intel-wired-lan@lists.osuosl.org, Björn Töpel,
 magnus.karlsson@intel.com, magnus.karlsson@gmail.com,
 netdev@vger.kernel.org, brouer@redhat.com
Subject: Re: [PATCH] i40e: replace switch-statement with if-clause
Message-ID: <20190121195936.0badfb33@redhat.com>
In-Reply-To: <20190121163356.31332-1-bjorn.topel@gmail.com>
References: <20190121163356.31332-1-bjorn.topel@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8BIT
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

On Mon, 21 Jan 2019 17:33:56 +0100
bjorn.topel@gmail.com wrote:

> From: Björn Töpel
>
> GCC will generate jump tables for switch-statements with more than 5
> case statements. An entry into the jump table is an indirect call,
> which means that for CONFIG_RETPOLINE builds, this is rather
> expensive.
>
> This commit replaces the switch-statement that acts on the XDP program
> result with an if-clause.
>
> The if-clause was also refactored into a common function that can be
> used by AF_XDP zero-copy and non-zero-copy code.
>
> Performance prior this patch:
> $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> XDP stats       CPU     pps         issue-pps
> XDP-RX CPU      20      18983018    0
> XDP-RX CPU      total   18983018
>
> RXQ stats       RXQ:CPU pps         issue-pps
> rx_queue_index   20:20  18983012    0
> rx_queue_index   20:sum 18983012
>
> $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
>  sock0@enp134s0f0:20 rxdrop
>                 pps         pkts        2.00
> rx              14,641,496  144,751,092
> tx              0           0
>
> And after:
> $ sudo ./xdp_rxq_info --dev enp134s0f0 --action XDP_DROP
> Running XDP on dev:enp134s0f0 (ifindex:7) action:XDP_DROP options:no_touch
> XDP stats       CPU     pps         issue-pps
> XDP-RX CPU      20      24000986    0
> XDP-RX CPU      total   24000986
>
> RXQ stats       RXQ:CPU pps         issue-pps
> rx_queue_index   20:20  24000985    0
> rx_queue_index   20:sum 24000985
>
> +26%
>
> $ sudo ./xdpsock -i enp134s0f0 -q 20 -n 2 -z -r
>  sock0@enp134s0f0:20 rxdrop
>                 pps         pkts        2.00
> rx              17,623,578  163,503,263
> tx              0           0
>
> +20%

The saving, i.e. the cost of the retpoline, is around 11 ns per packet,
which corresponds well with my previous experience and microbenchmarking,
where I measured around 12 ns:

 ((1/18983012)-(1/24000986))*10^9
 = 11.01372430029 ns

 ((1/14641496)-(1/17623578))*10^9
 = 11.55686507951 ns

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer