From: "Jonathan Lemon"
To: "Laatz, Kevin"
Cc: "Jakub Kicinski", netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, bjorn.topel@intel.com, magnus.karlsson@intel.com, bpf@vger.kernel.org, intel-wired-lan@lists.osuosl.org, bruce.richardson@intel.com, ciara.loftus@intel.com
Subject: Re: [PATCH 00/11] XDP unaligned chunk placement support
Date: Fri, 28 Jun 2019 13:29:02 -0700
References: <20190620083924.1996-1-kevin.laatz@intel.com> <20190627142534.4f4b8995@cakuba.netronome.com>
X-Mailing-List: netdev@vger.kernel.org

On 28 Jun 2019, at 9:19, Laatz, Kevin wrote:

> On 27/06/2019 22:25, Jakub Kicinski wrote:
>> On Thu, 27 Jun 2019 12:14:50 +0100, Laatz, Kevin
wrote:
>>> On the application side (xdpsock), we don't have to worry about the
>>> user defined headroom, since it is 0, so we only need to account for
>>> the XDP_PACKET_HEADROOM when computing the original address (in the
>>> default scenario).
>> That assumes specific layout for the data inside the buffer. Some NICs
>> will prepend information like timestamp to the packet, meaning the
>> packet would start at offset XDP_PACKET_HEADROOM + metadata len..
>
> Yes, if NICs prepend extra data to the packet that would be a problem for
> using this feature in isolation. However, if we also add in support for
> in-order RX and TX rings, that would no longer be an issue. However, even
> for NICs which do prepend data, this patchset should not break anything
> that is currently working.

I read this as "the correct buffer address is recovered from the shadow
ring". I'm not sure I'm comfortable with that, and I'm also not sold on
in-order completion for the RX/TX rings.

>> I think that's very limiting. What is the challenge in providing
>> aligned addresses, exactly?
> The challenges are two-fold:
> 1) it prevents using arbitrary buffer sizes, which will be an issue
>    supporting e.g. jumbo frames in future.
> 2) higher level user-space frameworks which may want to use AF_XDP,
>    such as DPDK, do not currently support having buffers with 'fixed'
>    alignment.
>     The reason that DPDK uses arbitrary placement is that:
>         - it would stop things working on certain NICs which need the
>           actual writable space specified in units of 1k - therefore we
>           need 2k + metadata space.
>         - we place padding between buffers to avoid constantly hitting
>           the same memory channels when accessing memory.
>         - it allows the application to choose the actual buffer size it
>           wants to use.
>     We make use of the above to allow us to speed up processing
>     significantly and also reduce the packet buffer memory size.
>
>     Not having arbitrary buffer alignment also means an AF_XDP driver
>     for DPDK cannot be a drop-in replacement for existing drivers in
>     those frameworks. Even with a new capability to allow an arbitrary
>     buffer alignment, existing apps will need to be modified to use that
>     new capability.

Since all buffers in the umem are the same chunk size, the original
buffer address can be recalculated with some multiply/shift math.
However, this is more expensive than just a mask operation.
-- 
Jonathan
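
To make the cost difference in the last paragraph concrete, here is a minimal C sketch of the two ways to recover a chunk's base address from an address inside it. The helper names are hypothetical (they come from neither the patchset nor the kernel), and the second helper assumes the chunks are packed back to back starting at offset 0 of the umem, which is how "all buffers are the same chunk size" is read here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: aligned mode. Assumes chunk_size is a power of two
 * and every chunk starts on a chunk_size boundary, so clearing the low
 * bits of the address yields the chunk base.
 */
static uint64_t chunk_base_mask(uint64_t addr, uint64_t chunk_size)
{
	return addr & ~(chunk_size - 1);
}

/*
 * Hypothetical helper: arbitrary chunk size. Assumes chunks are packed
 * back to back from umem offset 0, so the chunk index is addr / chunk_size
 * and the base is that index times chunk_size. The division is the
 * expensive part compared with the single AND above.
 */
static uint64_t chunk_base_divmul(uint64_t addr, uint64_t chunk_size)
{
	return (addr / chunk_size) * chunk_size;
}

/* Small self-check using an offset of 256 bytes into a chunk. */
static void chunk_base_selfcheck(void)
{
	/* Power-of-two size: both approaches agree (chunk 3 of 2048 bytes). */
	assert(chunk_base_mask(3 * 2048 + 256, 2048) == 3 * 2048);
	assert(chunk_base_divmul(3 * 2048 + 256, 2048) == 3 * 2048);

	/* Non-power-of-two size: only the divide/multiply form applies. */
	assert(chunk_base_divmul(2 * 3000 + 256, 3000) == 2 * 3000);
}
```

For a power-of-two chunk size both helpers return the same base; only the second one works for a size such as 3000. A precomputed reciprocal could replace the division with a multiply and a shift, which is not shown here.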