netdev.vger.kernel.org archive mirror
From: "Laatz, Kevin" <kevin.laatz@intel.com>
To: Jonathan Lemon <jonathan.lemon@gmail.com>
Cc: netdev@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	bjorn.topel@intel.com, magnus.karlsson@intel.com,
	bpf@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	bruce.richardson@intel.com, ciara.loftus@intel.com
Subject: Re: [PATCH 00/11] XDP unaligned chunk placement support
Date: Thu, 27 Jun 2019 12:14:50 +0100	[thread overview]
Message-ID: <ef7e9469-e7be-647b-8bb1-da29bc01fa2e@intel.com> (raw)
In-Reply-To: <FA8389B9-F89C-4BFF-95EE-56F702BBCC6D@gmail.com>


On 25/06/2019 19:44, Jonathan Lemon wrote:
> On 20 Jun 2019, at 1:39, Kevin Laatz wrote:
>
>> This patchset adds the ability to use unaligned chunks in the XDP umem.
>>
>> Currently, all chunk addresses passed to the umem are masked to be chunk
>> size aligned (default is 2k, max is PAGE_SIZE). This limits where we can
>> place chunks within the umem as well as limiting the packet sizes that
>> are supported.
>>
>> The changes in this patchset remove these restrictions, allowing XDP to
>> be more flexible in where it can place a chunk within a umem. By relaxing
>> where the chunks can be placed, it allows us to use an arbitrary buffer
>> size and place it wherever we have a free address in the umem. These
>> changes add the ability to support jumbo frames and make it easy to
>> integrate with other existing frameworks that have their own memory
>> management systems, such as DPDK.
>
> I'm a little unclear on how this should work, and have a few issues here:
>
>  1) There isn't any support for the user defined umem->headroom
>

For the unaligned chunks case, it does not make sense to support a 
user-defined headroom, since the user can point directly to where they 
want the data to start via the buffer address. Therefore, for unaligned 
chunks, the user-defined headroom must always be 0 (i.e. the user did 
not define a headroom and the default value of 0 is used). Any other 
value is caught and we return an invalid argument error.


>  2) When queuing RX buffers, the handle (aka umem offset) is used, which
>     points to the start of the buffer area.  When the buffer appears in
>     the completion queue, handle points to the start of the received 
> data,
>     which might be different from the buffer start address.
>
>     Normally, this RX address is just put back in the fill queue, and the
>     mask is used to find the buffer start address again.  This no longer
>     works, so my question is, how is the buffer start address recomputed
>     from the actual data payload address?
>
>     Same with TX - if the TX payload isn't aligned in with the start of
>     the buffer, what happens?

On the application side (xdpsock), we don't have to worry about the 
user-defined headroom, since it is 0, so we only need to account for 
XDP_PACKET_HEADROOM when computing the original address (in the default 
scenario). This was missing from v1; it will be added in v2, so that 
xdpsock uses the default value from libbpf. If the user is running 
another BPF program that uses a different offset, the computation will 
need to be adjusted accordingly. In v2 we'll add support for this via a 
command-line parameter.

However, we are also working on an "in-order" patchset, hopefully to be 
published soon, which guarantees that the buffers returned to the 
application are in the same order as those provided to the kernel. 
Longer term, this is the best solution here, as it allows the 
application to track for itself, via a "shadow ring" or otherwise, the 
buffers sent to the kernel and any metadata associated with them, such 
as the start-of-buffer address.

>
>  3) This appears limited to crossing a single page boundary, but there
>     is no constraint check on chunk_size.

There is an existing check on chunk_size during xdp_umem_reg() (in 
xdp_umem.c). The check ensures that the chunk size is at least 
XDP_UMEM_MIN_CHUNK_SIZE and at most PAGE_SIZE. Since the maximum is the 
page size, a buffer can cross at most one page boundary, so we only 
need to check the immediately following page for contiguity. While this 
patchset allows a maximum of 4k-sized buffers, it is still an 
improvement on the current state. Future enhancements could look at 
extending the 4k limit, but for now it is a good first step towards 
supporting hugepages efficiently.

Best regards,
Kevin

  reply	other threads:[~2019-06-27 11:14 UTC|newest]

Thread overview: 27+ messages
2019-06-20  8:39 [PATCH 00/11] XDP unaligned chunk placement support Kevin Laatz
2019-06-20  8:39 ` [PATCH 01/11] i40e: simplify Rx buffer recycle Kevin Laatz
2019-06-20  8:39 ` [PATCH 02/11] ixgbe: " Kevin Laatz
2019-06-20  8:39 ` [PATCH 03/11] xdp: add offset param to zero_copy_allocator Kevin Laatz
2019-06-20  8:39 ` [PATCH 04/11] i40e: add offset to zca_free Kevin Laatz
2019-06-20  8:39 ` [PATCH 05/11] ixgbe: " Kevin Laatz
2019-06-20  8:39 ` [PATCH 06/11] xsk: add support to allow unaligned chunk placement Kevin Laatz
2019-06-20  8:39 ` [PATCH 07/11] libbpf: add flags to umem config Kevin Laatz
2019-06-20  8:39 ` [PATCH 08/11] samples/bpf: add unaligned chunks mode support to xdpsock Kevin Laatz
2019-06-20  8:39 ` [PATCH 09/11] samples/bpf: add buffer recycling for unaligned chunks " Kevin Laatz
2019-06-20  8:39 ` [PATCH 10/11] samples/bpf: use hugepages in xdpsock app Kevin Laatz
2019-06-20  8:39 ` [PATCH 11/11] doc/af_xdp: include unaligned chunk case Kevin Laatz
2019-06-24 15:38 ` [PATCH 00/11] XDP unaligned chunk placement support Björn Töpel
2019-06-25 13:12   ` Laatz, Kevin
2019-06-25 18:44 ` Jonathan Lemon
2019-06-27 11:14   ` Laatz, Kevin [this message]
2019-06-27 21:25     ` Jakub Kicinski
2019-06-28 16:19       ` Laatz, Kevin
2019-06-28 16:51         ` Björn Töpel
2019-06-28 20:08           ` Jakub Kicinski
2019-06-28 20:25         ` Jakub Kicinski
2019-06-28 20:29         ` Jonathan Lemon
2019-07-01 14:58           ` Laatz, Kevin
     [not found]           ` <07e404eb-f712-b15a-4884-315aff3f7c7d@intel.com>
2019-07-01 21:20             ` Jakub Kicinski
2019-07-02  9:27               ` Richardson, Bruce
2019-07-02 16:33                 ` Jonathan Lemon
2019-06-20  9:09 Kevin Laatz
