From: "Zhang, Qi Z"
To: Honnappa Nagarahalli, Joyce Kong, "Xing, Beilei", Ruifeng Wang
CC: dev@dpdk.org, nd
Subject: Re: [dpdk-dev] [PATCH v1] net/i40e: remove the SMP barrier in HW scanning func
Date: Wed, 16 Jun 2021 13:29:24 +0000
Message-ID: <12226b6e56ad4c11845242031c9505d9@intel.com>
References: <20210604073405.14880-1-joyce.kong@arm.com> <561469a10f13450bae9e857f186b0123@intel.com> <2cb94e2e1bf74840acaadc389b4745f5@intel.com>

Hi

> -----Original Message-----
> From: Honnappa Nagarahalli
> Sent: Tuesday, June 8, 2021 5:36 AM
> To: Zhang, Qi Z; Joyce Kong; Xing, Beilei; Ruifeng Wang
> Cc:
> dev@dpdk.org; nd; Honnappa Nagarahalli; nd
> Subject: RE: [PATCH v1] net/i40e: remove the SMP barrier in HW scanning
> func
>
> > > > > > Add the logic to determine how many DD bits have been set for
> > > > > > contiguous packets, for removing the SMP barrier while reading
> > > > > > descs.
> > > >
> > > > I didn't understand this.
> > > > The current logic already guarantees that the DD bits read out are
> > > > from contiguous packets, as it reads the Rx descriptors in reverse
> > > > order from the ring.
> > > Qi, the comments in the code mention that there is a race condition
> > > if the descriptors are not read in the reverse order. But, they do
> > > not mention what the race condition is and how it can occur.
> > > Appreciate if you could explain that.
> >
> > The race condition happens between the NIC and the CPU. If the DD bits
> > are written and read in the same order, there might be a hole (e.g.
> > 1 0 1 1). With the reverse read order, we make sure there is no "1"
> > after the first "0". As the read addresses are declared volatile, the
> > compiler will not re-order them.
> My understanding is that
>
> 1) the NIC will write an entire cache line of descriptors to memory
> "atomically" (i.e. the entire cache line is visible to the CPU at once)
> if there are enough descriptors ready to fill one cache line.
> 2) But, if there are not enough descriptors ready (because, for example,
> there is not enough traffic), then it might write partial cache lines.

Yes. For example, a cache line contains 4 x 16-byte descriptors, and it is
possible we get 1 1 1 0 for the DD bits at some moment.

>
> Please correct me if I am wrong.
>
> For #1, I do not think it matters whether we read the descriptors in
> reverse order or not, as the cache line is written atomically.

I think the cases below may happen if we don't read in reverse order:

1. The CPU gets the first cache line as 1 1 1 0 in a loop.
2. New packets come in, and the NIC appends the last 1 to the first cache
line, plus a new cache line with 1 1 1 1.
3. The CPU continues with the new cache line of 1 1 1 1 in the same loop,
but the last 1 of the first cache line is missed, so finally it gets
1 1 1 0 1 1 1 1.

> For #1, if we read in reverse order, does it make sense to not check the
> DD bits of descriptors that are earlier in the order once we encounter a
> descriptor that has its DD bit set? This is because the NIC updates the
> descriptors in order.

I think the answer is yes. When we meet the first set DD bit, we should be
able to calculate the exact number based on the index, but I am not sure
how much performance gain that brings.

>
> > >
> > > On x86, the reads are not re-ordered (though the compiler can
> > > re-order). On ARM, the reads can get re-ordered and hence the
> > > barriers are required. In order to avoid the barriers, we are trying
> > > to process only those descriptors whose DD bits are set such that
> > > they are contiguous, i.e. if the DD bits are 1011, we process only
> > > the first descriptor.
> >
> > Ok, I see. Thanks for the explanation.
> > At this moment, I may prefer not to change the behavior on x86, so a
> > compile option for Arm can be added. In future, when we observe no
> > performance impact on x86 as well, we can consider removing it. What
> > do you think?
> I am ok with this approach.
>
> >
> > > > So I didn't see new logic being added. Would you describe the
> > > > purpose of this patch more clearly?
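The "calculate the exact number based on the index" idea above can be
sketched in plain C. This is only an illustrative model, not the i40e
driver code: the helper name, the LOOK_AHEAD value, and the DD mask are
assumptions, and in a real driver the correctness depends on the volatile
reads actually happening in reverse program order.

```c
#include <assert.h>
#include <stdint.h>

#define LOOK_AHEAD 8

/*
 * Hypothetical reverse-order scan: because the NIC sets DD bits in ring
 * order, if the reads are issued in reverse order and not re-ordered,
 * the first set DD bit found when scanning backwards tells us exactly
 * how many descriptors are done -- no need to check the earlier ones.
 */
static int count_dd_reverse(const uint32_t *status, uint32_t dd_mask)
{
	for (int j = LOOK_AHEAD - 1; j >= 0; j--) {
		if (status[j] & dd_mask)
			return j + 1; /* descriptors 0..j are all done */
	}
	return 0; /* no descriptor ready */
}
```

Note this only needs one set bit to conclude, whereas a forward scan has
to touch every descriptor up to the first clear bit.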
> > > >
> > > > >
> > > > > Signed-off-by: Joyce Kong
> > > > > Reviewed-by: Ruifeng Wang
> > > > > ---
> > > > >  drivers/net/i40e/i40e_rxtx.c | 13 ++++++++-----
> > > > >  1 file changed, 8 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/net/i40e/i40e_rxtx.c
> > > > > b/drivers/net/i40e/i40e_rxtx.c
> > > > > index 6c58decec..410a81f30 100644
> > > > > --- a/drivers/net/i40e/i40e_rxtx.c
> > > > > +++ b/drivers/net/i40e/i40e_rxtx.c
> > > > > @@ -452,7 +452,7 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> > > > >  	uint16_t pkt_len;
> > > > >  	uint64_t qword1;
> > > > >  	uint32_t rx_status;
> > > > > -	int32_t s[I40E_LOOK_AHEAD], nb_dd;
> > > > > +	int32_t s[I40E_LOOK_AHEAD], var, nb_dd;
> > > > >  	int32_t i, j, nb_rx = 0;
> > > > >  	uint64_t pkt_flags;
> > > > >  	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
> > > > > @@ -482,11 +482,14 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
> > > > >  					I40E_RXD_QW1_STATUS_SHIFT;
> > > > >  		}
> > > > >
> > > > > -		rte_smp_rmb();
> > > >
> > > > Any performance gain by removing this? And it is not necessary to
> > > > be combined with the below change, right?
> > > >
> > > > > -
> > > > >  		/* Compute how many status bits were set */
> > > > > -		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++)
> > > > > -			nb_dd += s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> > > > > +		for (j = 0, nb_dd = 0; j < I40E_LOOK_AHEAD; j++) {
> > > > > +			var = s[j] & (1 << I40E_RX_DESC_STATUS_DD_SHIFT);
> > > > > +			if (var)
> > > > > +				nb_dd += 1;
> > > > > +			else
> > > > > +				break;
> > > > > +		}
> > > > >
> > > > >  		nb_rx += nb_dd;
> > > > >
> > > > > --
> > > > > 2.17.1
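For reference, the patched loop can be modeled as a standalone function.
This is a sketch under assumptions (hypothetical names; it assumes the DD
bit is bit 0, i.e. I40E_RX_DESC_STATUS_DD_SHIFT == 0), not the driver
source itself. It shows why the change matters: only the leading
contiguous DD bits are counted, so a hole such as 1 0 1 1 yields 1
rather than 3.

```c
#include <assert.h>
#include <stdint.h>

#define LOOK_AHEAD 8
#define DD_BIT     0x1u /* assumption: DD is bit 0 of the status word */

/*
 * Model of the patched forward scan: count DD bits only while they are
 * contiguous from the start, stopping at the first clear bit. This makes
 * a "hole" caused by re-ordered reads harmless, at the cost of counting
 * one descriptor per iteration.
 */
static int count_contiguous_dd(const uint32_t *s)
{
	int nb_dd = 0;

	for (int j = 0; j < LOOK_AHEAD; j++) {
		if (s[j] & DD_BIT)
			nb_dd += 1;
		else
			break; /* stop at the first hole */
	}
	return nb_dd;
}
```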