Date: Fri, 28 Jun 2019 13:25:16 -0700
From: Jakub Kicinski
To: "Laatz, Kevin"
Cc: Jonathan Lemon, netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, bjorn.topel@intel.com, magnus.karlsson@intel.com, bpf@vger.kernel.org, intel-wired-lan@lists.osuosl.org, bruce.richardson@intel.com, ciara.loftus@intel.com
Subject: Re: [PATCH 00/11] XDP unaligned chunk placement support
Message-ID: <20190628132516.723ef517@cakuba.netronome.com>
References: <20190620083924.1996-1-kevin.laatz@intel.com> <20190627142534.4f4b8995@cakuba.netronome.com>
Organization: Netronome Systems, Ltd.
X-Mailing-List: bpf@vger.kernel.org

On Fri, 28 Jun 2019 17:19:09 +0100, Laatz, Kevin wrote:
> On 27/06/2019 22:25, Jakub Kicinski wrote:
> > On Thu, 27 Jun 2019 12:14:50 +0100, Laatz, Kevin wrote:
> >> On the application side (xdpsock), we don't have to worry about the user
> >> defined headroom, since it is 0, so we only need to account for the
> >> XDP_PACKET_HEADROOM when computing the original address (in the default
> >> scenario).
> > That assumes specific layout for the data inside the buffer. Some NICs
> > will prepend information like timestamp to the packet, meaning the
> > packet would start at offset XDP_PACKET_HEADROOM + metadata len..
>
> Yes, if NICs prepend extra data to the packet that would be a problem for
> using this feature in isolation. However, if we also add in support for
> in-order RX and TX rings, that would no longer be an issue.

Can you shed more light on in-order rings? Do you mean that RX frames
come in the order the buffers were placed in the fill queue? That
wouldn't make practical sense, no? Even if the application does no
reordering there is also XDP_DROP and XDP_TX. Please explain :)

> However, even for NICs which do prepend data, this patchset should
> not break anything that is currently working.

My understanding from the beginnings of AF_XDP was that we were
searching for a format flexible enough to support most if not all NICs.
Creating an ABI which will preclude vendors from supporting DPDK via
AF_XDP would seriously undermine the neutrality aspect.

> > I think that's very limiting. What is the challenge in providing
> > aligned addresses, exactly?
> The challenges are two-fold:
> 1) it prevents using arbitrary buffer sizes, which will be an issue
> supporting e.g. jumbo frames in future.

Presumably support for jumbos would require a multi-buffer setup, and
therefore extensions to the ring format. Should we perhaps look into
implementing unaligned chunks by extending the ring format as well?

> 2) higher level user-space frameworks which may want to use AF_XDP, such
> as DPDK, do not currently support having buffers with 'fixed' alignment.
>     The reason that DPDK uses arbitrary placement is that:
>         - it would stop things working on certain NICs which need the
> actual writable space specified in units of 1k - therefore we need 2k +
> metadata space.
>         - we place padding between buffers to avoid constantly hitting
> the same memory channels when accessing memory.
>         - it allows the application to choose the actual buffer size it
> wants to use.
>     We make use of the above to allow us to speed up processing
> significantly and also reduce the packet buffer memory size.
>
>     Not having arbitrary buffer alignment also means an AF_XDP driver
> for DPDK cannot be a drop-in replacement for existing drivers in those
> frameworks. Even with a new capability to allow an arbitrary buffer
> alignment, existing apps will need to be modified to use that new
> capability.
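
To make the address concern from earlier in the thread concrete, here is a
rough sketch of the two ways an application might recover the start of a
chunk from an RX descriptor address. It is illustrative only and is not
taken from the patchset: the helper names (chunk_base_aligned,
chunk_base_fixed_headroom, rx_packet_addr) and the metadata_len parameter
are hypothetical. The only value borrowed from the kernel is
XDP_PACKET_HEADROOM, which the UAPI header defines as 256.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as the kernel UAPI definition in <linux/bpf.h>. */
#ifndef XDP_PACKET_HEADROOM
#define XDP_PACKET_HEADROOM 256
#endif

/*
 * Hypothetical helper: where the packet data starts inside a chunk when
 * the NIC prepends metadata_len bytes (e.g. a timestamp) before it.
 */
static inline uint64_t rx_packet_addr(uint64_t chunk_base,
                                      uint64_t user_headroom,
                                      uint64_t metadata_len)
{
    return chunk_base + XDP_PACKET_HEADROOM + user_headroom + metadata_len;
}

/*
 * Hypothetical helper for aligned chunks: the chunk base is recovered by
 * masking, so it does not matter where in the chunk the packet starts.
 * chunk_size must be a power of two.
 */
static inline uint64_t chunk_base_aligned(uint64_t addr, uint64_t chunk_size)
{
    assert(chunk_size != 0 && (chunk_size & (chunk_size - 1)) == 0);
    return addr & ~(chunk_size - 1);
}

/*
 * Hypothetical helper for unaligned chunks, following the approach quoted
 * above: subtract a fixed headroom. This is only correct if the packet
 * starts exactly at XDP_PACKET_HEADROOM + user_headroom.
 */
static inline uint64_t chunk_base_fixed_headroom(uint64_t addr,
                                                 uint64_t user_headroom)
{
    return addr - XDP_PACKET_HEADROOM - user_headroom;
}
```

With an unaligned chunk at address 3000, no user headroom and 8 bytes of
prepended metadata, the fixed-headroom subtraction yields 3008 rather than
3000, i.e. it is off by the metadata length. The masking variant has no
such dependency, but it only works when chunks are aligned to their size.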