Subject: Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
To: Jakub Kicinski, Stanislav Fomichev
Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S.
    Miller", Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
    Cong Wang, Jiri Pirko, netdev@vger.kernel.org, bpf@vger.kernel.org,
    William Tu
References: <20190813120558.6151-1-toshiaki.makita1@gmail.com>
    <20190814170715.GJ2820@mini-arch>
    <14c4a876-6f5d-4750-cbe4-19622f64975b@gmail.com>
    <20190815152100.GN2820@mini-arch>
    <20190815122232.4b1fa01c@cakuba.netronome.com>
From: Toshiaki Makita
Date: Fri, 16 Aug 2019 10:28:10 +0900
In-Reply-To: <20190815122232.4b1fa01c@cakuba.netronome.com>
X-Mailing-List: bpf@vger.kernel.org

On 2019/08/16 4:22, Jakub Kicinski wrote:
> On Thu, 15 Aug 2019 08:21:00 -0700, Stanislav Fomichev wrote:
>> On 08/15, Toshiaki Makita wrote:
>>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>>> On 08/13, Toshiaki Makita wrote:
>>>>> * Implementation
>>>>>
>>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>>> mainly because flow insertion is considerably frequent. If we generate
>>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>>> first packet of ping in above test will increase, which I want to avoid.
>>>> Can this be instead implemented with a new hook that will be called
>>>> for TC events? This hook can write to perf event buffer and control
>>>> plane will insert/remove/modify flow tables in the BPF maps (control
>>>> plane will also install xdp program).
>>>>
>>>> Why do we need UMH? What am I missing?
>>>
>>> So you suggest doing everything in xdp_flow kmod?
>> You probably don't even need xdp_flow kmod.
>> Add new tc "offload" mode
>> (bypass) that dumps every command via netlink (or calls the BPF hook
>> where you can dump it into perf event buffer) and then read that info
>> from userspace and install xdp programs and modify flow tables.
>> I don't think you need any kernel changes besides that stream
>> of data from the kernel about qdisc/tc flow creation/removal/etc.
>
> There's a certain allure in bringing the in-kernel BPF translation
> infrastructure forward. OTOH from system architecture perspective IMHO
> it does seem like a task best handled in user space. bpfilter can replace
> iptables completely, here we're looking at an acceleration relatively
> loosely coupled with flower.

I don't think it's loosely coupled. Emulating TC behavior in userspace
is not so easy.

Think about the recent multi-mask support in flower. Previously userspace
could assume there is one mask and one hash table for each preference in
TC. After the change, TC accepts different masks with the same pref. Such
a change tends to break userspace emulation. An emulation that has not
been updated may ignore the mask passed at flow insertion and use the
mask remembered from the first flow inserted with that pref. It may
override the mask of all existing flows with the pref. It may fail to
insert such flows. Any of these would result in unexpectedly wrong
datapath handling, which is critical.
I think such an emulation layer needs to be updated in sync with TC.

Toshiaki Makita
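
To make the multi-mask failure mode above concrete, here is a minimal sketch in C. It is not kernel or flower code: the names `struct flow`, `struct table`, `insert_multi_mask`, `insert_single_mask` and `lookup` are all hypothetical, and a single 32-bit value stands in for the whole flow key. One table models a pref where each flow keeps its own mask; the other models an emulation that still assumes one mask per pref and reuses the mask of the first flow.

```c
#include <stdint.h>

#define MAX_FLOWS 8

/* Hypothetical, simplified flow: a 32-bit key (e.g. an IPv4 address),
 * the mask it is matched under, and an action id (0 means "miss"). */
struct flow {
    uint32_t key;
    uint32_t mask;
    int action;
};

/* Hypothetical table holding the flows of a single pref. */
struct table {
    struct flow flows[MAX_FLOWS];
    int n;
    uint32_t first_mask; /* only used by the single-mask emulation */
};

/* Multi-mask behavior: each flow is stored with its own mask. */
int insert_multi_mask(struct table *t, uint32_t key, uint32_t mask, int action)
{
    if (t->n == MAX_FLOWS)
        return -1;
    t->flows[t->n++] = (struct flow){ key & mask, mask, action };
    return 0;
}

/* Stale emulation: assumes one mask per pref, so it ignores the mask
 * passed in and reuses the mask remembered from the first flow. */
int insert_single_mask(struct table *t, uint32_t key, uint32_t mask, int action)
{
    if (t->n == MAX_FLOWS)
        return -1;
    if (t->n == 0)
        t->first_mask = mask;
    t->flows[t->n++] = (struct flow){ key & t->first_mask, t->first_mask, action };
    return 0;
}

/* First-match lookup: returns the action of the first matching flow,
 * or 0 if no flow matches. */
int lookup(const struct table *t, uint32_t pkt)
{
    for (int i = 0; i < t->n; i++)
        if ((pkt & t->flows[i].mask) == t->flows[i].key)
            return t->flows[i].action;
    return 0;
}
```

For example, insert 192.168.0.0 with a /16 mask (action 1) and then 10.1.2.3 with a /32 mask (action 2) into both tables. A packet to 10.1.9.9 misses in the multi-mask table, but the single-mask emulation has silently widened the second flow to 10.1.0.0/16 and applies action 2 to it. This is the "ignore masks passed from flow insertion" case; the other two cases (overriding existing masks, rejecting the flow) are not modeled here.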