Subject: RE: [RFC PATCH v1 0/4] Direct re-arming of buffers on receive side
Date: Thu, 27 Jan 2022 18:13:38 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86E50@smartserver.smartshare.dk>
References: <20211224164613.32569-1-feifei.wang2@arm.com> <98CBD80474FA8B44BF855DF32C47DC35D86DAF@smartserver.smartshare.dk>
From: Morten Brørup
To: Honnappa Nagarahalli, "Ananyev, Konstantin"
Cc: nd, Feifei Wang, "Yigit, Ferruh", Andrew Rybchenko, "Zhang, Qi Z", "Xing, Beilei"

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Thursday, 27 January 2022 05.07
>
> Thanks Morten, appreciate your comments. A few responses inline.
>
> > -----Original Message-----
> > From: Morten Brørup
> > Sent: Sunday, December 26, 2021 4:25 AM
> >
> > > From: Feifei Wang [mailto:feifei.wang2@arm.com]
> > > Sent: Friday, 24 December 2021 17.46
> > >
> > > However, this solution poses several constraints:
> > >
> > > 1) The receive queue needs to know which transmit queue it should
> > > take the buffers from. The application logic decides which transmit
> > > port to use to send out the packets. In many use cases the NIC might
> > > have a single port ([1], [2], [3]), in which case a given transmit
> > > queue is always mapped to a single receive queue (1:1 Rx queue : Tx
> > > queue). This is easy to configure.
> > >
> > > If the NIC has 2 ports (there are several references), then we will
> > > have a 1:2 (Rx queue : Tx queue) mapping, which is still easy to
> > > configure. However, if this is generalized to 'N' ports, the
> > > configuration can be long. Moreover, the PMD would have to scan a
> > > list of transmit queues to pull the buffers from.
> >
> > I disagree with the description of this constraint.
> >
> > As I understand it, it doesn't matter how many ports or queues are in
> > a NIC or system.
> >
> > The constraint is more narrow:
> >
> > This patch requires that all packets ingressing on some port/queue
> > must egress on the specific port/queue that it has been configured to
> > rearm its buffers from. I.e. an application cannot route packets
> > between multiple ports with this patch.
> Agree, this patch as is has this constraint. It is not a constraint
> that would apply to NICs with a single port. The above text is
> describing some of the issues associated with generalizing the
> solution for N ports. If N is small, the configuration is small and
> scanning should not be bad.

Perhaps we can live with the 1:1 limitation, if that is the primary use
case.

Alternatively, the feature could fall back to using the mempool if
unable to get/put buffers directly from/to a participating NIC. In this
case, I envision a library serving as a shim layer between the NICs and
the mempool. In other words: Take a step back from the implementation,
and discuss the high level requirements and architecture of the
proposed feature.
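For illustration only, the "get" path of such a shim could look roughly
like this. Everything except rte_mbuf, rte_mempool and
rte_pktmbuf_alloc_bulk() is a hypothetical name invented for this
sketch, not existing DPDK API:

#include <rte_mbuf.h>

struct paired_txq;                      /* hypothetical Rx/Tx pairing */
uint16_t txq_take_completed(struct paired_txq *txq,
                            struct rte_mbuf **bufs, uint16_t n);

/* Fill 'bufs' with up to 'n' mbufs for RX rearming, preferring buffers
 * recycled directly from the paired TX queue, and falling back to the
 * mempool for the remainder. */
static inline uint16_t
shim_get_rearm_bufs(struct rte_mempool *mp, struct paired_txq *txq,
                    struct rte_mbuf **bufs, uint16_t n)
{
        uint16_t got = 0;

        /* Direct path: completed TX buffers, still hot in the cache. */
        if (txq != NULL)
                got = txq_take_completed(txq, bufs, n);

        if (got == n)
                return n;

        /* Fallback: normal mempool allocation (all-or-nothing bulk). */
        if (rte_pktmbuf_alloc_bulk(mp, &bufs[got], n - got) != 0)
                return got;     /* partial refill; caller rearms 'got' */

        return n;
}

The "put" path would be symmetric: completed TX buffers go to the
paired RX queue if one is attached, otherwise back to the mempool as
today.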
> > You are missing the fourth constraint:
> >
> > 4) The application must transmit all received packets immediately,
> > i.e. QoS queueing and similar is prohibited.
> I do not understand this, can you please elaborate? Even if there is
> QoS queuing, there would be a steady stream of packets being
> transmitted. These transmitted packets will fill the buffers on the
> RX side.

E.g. an appliance may receive packets on a 10 Gbps backbone port, and
queue some of the packets up for a customer with a 20 Mbit/s
subscription. When there is a large burst of packets towards that
subscriber, they will queue up in the QoS queue dedicated to that
subscriber. During that traffic burst, there is much more RX than TX.
And after the traffic burst, there will be more TX than RX.

> > The patch provides a significant performance improvement, but I am
> > wondering if any real world applications exist that would use this.
> > Only a "router on a stick" (i.e. a single-port router) comes to my
> > mind, and that is probably sufficient to call it useful in the real
> > world. Do you have any other examples to support the usefulness of
> > this patch?
> SmartNICs are a clear and dominant use case; typically they have a
> single port for data plane traffic (dual ports are mostly for
> redundancy). This patch avoids a good amount of store operations. The
> smaller CPUs found in SmartNICs have smaller store buffers, which can
> become bottlenecks. Avoiding the lcore cache saves valuable HW cache
> space.

OK. This is an important use case!

> > Anyway, the patch doesn't do any harm if unused, and the only
> > performance cost is the "if (rxq->direct_rxrearm_enable)" branch in
> > the Ethdev driver. So I don't oppose it.
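To be concrete about that cost: when the feature is disabled, it
amounts to a single well-predicted branch per RX refill, roughly along
these lines. Apart from the quoted flag name, the structure and
function names below are made up for illustration; this is not the
actual patch code:

struct rxq { int direct_rxrearm_enable; /* ...driver fields... */ };
void rxq_rearm_from_txq(struct rxq *rxq);     /* direct re-arm path   */
void rxq_rearm_from_mempool(struct rxq *rxq); /* existing refill path */

static inline void
rxq_rearm(struct rxq *rxq)
{
        /* The only overhead when unused: one well-predicted branch. */
        if (rxq->direct_rxrearm_enable)
                rxq_rearm_from_txq(rxq);
        else
                rxq_rearm_from_mempool(rxq);
}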