From: Thomas Monjalon
To: Viacheslav Ovsiienko
Cc: dev@dpdk.org, stephen@networkplumber.org, ferruh.yigit@intel.com, olivier.matz@6wind.com, jerinjacobk@gmail.com, maxime.coquelin@redhat.com, david.marchand@redhat.com, arybchenko@solarflare.com
Date: Mon, 12 Oct 2020 00:17:42 +0200
Message-ID: <5451712.xXqzTqBGid@thomas>
In-Reply-To: <1602083215-22921-2-git-send-email-viacheslavo@nvidia.com>
References: <1602083215-22921-1-git-send-email-viacheslavo@nvidia.com> <1602083215-22921-2-git-send-email-viacheslavo@nvidia.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/9] ethdev: introduce Rx buffer split

07/10/2020 17:06, Viacheslav Ovsiienko:
> The DPDK datapath in the transmit direction is very flexible.
> An application can build a multi-segment packet and manage
> almost all data aspects - the memory pools where segments
> are allocated from, the segment lengths, the memory attributes
> like external buffers registered for DMA, etc.
>
> In the receiving direction, the datapath is much less flexible:
> an application can only specify the memory pool to configure the
> receiving queue and nothing more. In order to extend the receiving
> datapath capabilities, it is proposed to add a way to provide
> extended information on how to split the packets being received.
>
> The following structure is introduced to specify the Rx packet
> segment:
>
> struct rte_eth_rxseg {
>     struct rte_mempool *mp; /* memory pool to allocate segment from */
>     uint16_t length; /* segment maximal data length */

The "length" parameter configures a split point.
Worth noting in the comment, I think.

>     uint16_t offset; /* data offset from beginning of mbuf data buffer */

Is it replacing RTE_PKTMBUF_HEADROOM?

>     uint32_t reserved; /* reserved field */
> };
>
> The new routine rte_eth_rx_queue_setup_ex() is introduced to
> set up the given Rx queue using the new extended Rx packet segment
> description:
>
> int
> rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
>                           uint16_t nb_rx_desc, unsigned int socket_id,
>                           const struct rte_eth_rxconf *rx_conf,
>                           const struct rte_eth_rxseg *rx_seg,
>                           uint16_t n_seg)

An alternative name for this function: rte_eth_rxseg_queue_setup

> This routine presents two new parameters:
>     rx_seg - pointer to the array of segment descriptions; each element
>              describes the memory pool, maximal data length, and initial
>              data offset from the beginning of the data buffer in the mbuf
>     n_seg  - number of elements in the array

Not clear why we need an array.
I suggest writing here that each segment of the same packet
can have different properties, the array representing the full packet.
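To illustrate how the array would be used, here is a minimal sketch written
against the API exactly as proposed in this patch (names and signatures may
still change); the two mempools and the 2048-byte payload length are only
assumptions for the example:

/* Illustrative only: follows the rte_eth_rxseg / rte_eth_rx_queue_setup_ex()
 * definitions proposed in this patch. pool_hdr and pool_data are assumed to
 * have been created by the application beforehand.
 */
#include <rte_common.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

static int
setup_split_rxq(uint16_t port_id, uint16_t queue_id, uint16_t nb_desc,
		struct rte_mempool *pool_hdr, struct rte_mempool *pool_data)
{
	struct rte_eth_rxseg segs[] = {
		/* seg0: 14B Ethernet header, placed after the default headroom */
		{ .mp = pool_hdr, .length = 14, .offset = RTE_PKTMBUF_HEADROOM },
		/* seg1: rest of the packet, at offset 0 in mbufs from another pool */
		{ .mp = pool_data, .length = 2048, .offset = 0 },
	};

	return rte_eth_rx_queue_setup_ex(port_id, queue_id, nb_desc,
			rte_eth_dev_socket_id(port_id),
			NULL /* default rte_eth_rxconf */,
			segs, RTE_DIM(segs));
}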
> The new offload flag DEV_RX_OFFLOAD_BUFFER_SPLIT in device

The name should start with the RTE_ prefix.

> capabilities is introduced to present the way for a PMD to report to
> the application that it supports splitting Rx packets into configurable
> segments. Prior to invoking the rte_eth_rx_queue_setup_ex() routine,
> the application should check the DEV_RX_OFFLOAD_BUFFER_SPLIT flag.
>
> If the Rx queue is configured with the new routine, the packets being
> received will be split into multiple segments pushed to the mbufs
> with the specified attributes. The PMD will allocate the first mbuf
> from the pool specified in the first segment descriptor and put
> the data starting at the specified offset in the allocated mbuf data
> buffer. If the packet length exceeds the specified segment length,
> the next mbuf will be allocated according to the next segment
> descriptor (if any) and data will be put in its data buffer at the
> specified offset, not exceeding the specified length. If there is
> no next descriptor, the next mbuf will be allocated and filled in the
> same way (from the same pool and with the same buffer offset/length)
> as the current one.
>
> For example, let's suppose we configured the Rx queue with the
> following segments:
>     seg0 - pool0, len0=14B,  off0=RTE_PKTMBUF_HEADROOM
>     seg1 - pool1, len1=20B,  off1=0B
>     seg2 - pool2, len2=20B,  off2=0B
>     seg3 - pool3, len3=512B, off3=0B
>
> A packet 46 bytes long will look like the following:
>     seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B long @ 0 in mbuf from pool1
>     seg2 - 12B long @ 0 in mbuf from pool2
>
> A packet 1500 bytes long will look like the following:
>     seg0 - 14B  @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B  @ 0 in mbuf from pool1
>     seg2 - 20B  @ 0 in mbuf from pool2
>     seg3 - 512B @ 0 in mbuf from pool3
>     seg4 - 512B @ 0 in mbuf from pool3
>     seg5 - 422B @ 0 in mbuf from pool3
>
> The offload DEV_RX_OFFLOAD_SCATTER must be present and
> configured to support the new buffer split feature (if n_seg
> is greater than one).
>
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to memory with different attributes.
> For example, the packet headers can be pushed to the embedded
> data buffers within mbufs and the application data into
> external buffers attached to mbufs allocated from
> different memory pools. The memory attributes of the split
> parts may differ as well - for example, the application data
> may be pushed into external memory located on a dedicated
> physical device, say a GPU or NVMe. This would improve the DPDK
> receiving datapath flexibility while preserving compatibility
> with the existing API.
>
> Also, the proposed segment description might be used to specify
> Rx packet split for some other features. For example, it could
> provide a way to specify the extra memory pool for the Header Split
> feature of some Intel PMDs.

I don't understand what you are referring to in this last paragraph.
I think the explanation above is enough to demonstrate the flexibility.

> Signed-off-by: Viacheslav Ovsiienko

Thank you, I like this feature.
More minor comments below.

[...]

> +* **Introduced extended buffer description for receiving.**

Rewording: "Introduced extended setup of Rx queue"

> +  * Added extended Rx queue setup routine
> +  * Added description for Rx segment sizes

Not only "sizes", but also offset and mempool.

> +  * Added capability to specify the memory pool for each segment

This one can be merged with the above, or offset should be added.

[...]

The doxygen comment is missing here.

> +int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
> +		uint16_t nb_rx_desc, unsigned int socket_id,
> +		const struct rte_eth_rxconf *rx_conf,
> +		const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);

This new function should be experimental, and it should be added to the .map file.
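For completeness, here is a minimal sketch of the capability check mentioned
in the commit log, assuming the flag keeps the proposed
DEV_RX_OFFLOAD_BUFFER_SPLIT name (still to be prefixed with RTE_) and is
reported in rx_offload_capa like the other Rx offloads:

/* Sketch only: DEV_RX_OFFLOAD_BUFFER_SPLIT is the flag proposed in this
 * patch; DEV_RX_OFFLOAD_SCATTER already exists. Both are assumed to be
 * advertised in dev_info.rx_offload_capa.
 */
#include <stdbool.h>
#include <rte_ethdev.h>

static bool
rx_buffer_split_supported(uint16_t port_id)
{
	struct rte_eth_dev_info dev_info;

	if (rte_eth_dev_info_get(port_id, &dev_info) != 0)
		return false;

	/* Both offloads must be supported before calling the new setup
	 * routine with n_seg greater than one.
	 */
	return (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT) != 0 &&
	       (dev_info.rx_offload_capa & DEV_RX_OFFLOAD_SCATTER) != 0;
}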