From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dong Wang Subject: Re: [PATCH] ixgbe:Add write memory barrier for recv pkts. Date: Tue, 5 May 2015 23:52:34 +0800 Message-ID: References: <2601191342CEEE43887BDE71AB97725821415A3A@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725821415E37@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772582141626B@irsmsx105.ger.corp.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit To: "Ananyev, Konstantin" , "dev-VfR2kkLFssw@public.gmane.org" Return-path: In-Reply-To: <2601191342CEEE43887BDE71AB9772582141626B-pww93C2UFcwu0RiL9chJVbfspsVTdybXVpNB7YpNyf8@public.gmane.org> List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org Sender: "dev" Hi, Konstantin, > > >> -----Original Message----- >> From: outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org [mailto:outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org] On Behalf Of Wang Dong >> Sent: Thursday, April 16, 2015 12:36 PM >> To: Ananyev, Konstantin; dev-VfR2kkLFssw@public.gmane.org >> Subject: Re: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv pkts. >> >>> >>> >>>> -----Original Message----- >>>> From: outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org [mailto:outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org] On Behalf Of Dong.Wang >>>> Sent: Wednesday, April 15, 2015 2:46 PM >>>> To: Ananyev, Konstantin; dev-VfR2kkLFssw@public.gmane.org >>>> Subject: Re: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv pkts. >>>> >>>> >>>> >>>>> Hi, >>>>> >>>>>> -----Original Message----- >>>>>> From: dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] On Behalf Of WangDong >>>>>> Sent: Saturday, April 11, 2015 4:34 PM >>>>>> To: dev-VfR2kkLFssw@public.gmane.org >>>>>> Subject: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv pkts. >>>>>> >>>>>> Like transmit packets, before update receive descriptor's tail pointer, rte_wmb() should be added after writing recv descriptor. >>>>>> >>>>>> Signed-off-by: Dong Wang >>>>>> --- >>>>>> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 5 +++++ >>>>>> 1 file changed, 5 insertions(+) >>>>>> >>>>>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>>>>> index 9da2c7e..d504688 100644 >>>>>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>>>>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>>>>> @@ -1338,6 +1338,9 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, >>>>>> */ >>>>>> rx_pkts[nb_rx++] = rxm; >>>>>> } >>>>>> + >>>>>> + rte_wmb(); >>>>>> + >>>>> >>>>> Why do you think it is necessary? >>>>> I can't see any good reason to put wmb() here. >>>>> I would understand if, at least you'll try to insert it just before updating RDT: >>>>> rx_id = (uint16_t) ((rx_id == 0) ? >>>>> (rxq->nb_rx_desc - 1) : (rx_id - 1)); >>>>> + rte_wmb(); >>>>> IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, rx_id); >>>>> >>>>> That is not needed IA with current implementation, but would make sense for machines with relaxed memory ordering. >>>>> Though right now DPDK IXGBE PMD is supported only on IA, anyway. >>>>> Same for ixgbe_recv_scattered_pkts(). >>>>> >>>>> Konstantin >>>> >>>> Yes, current implementation works well with IA, and the transmit packets >>>> function's rte_wmb() is also unneccessary. >>>> >>>> But there are two reasons for adding rte_wmb() in recv pkts function: >>>> 1) The memory barrier in recv pkts function and xmit pkts function are >>>> inconsistent, rte_wmb() should be added to recv pkts function or be >>>> removed from xmit pkts function. >>>> 2) DPDK will support PowerPC processor (Other developers are working on >>>> it), I check the memory ordering of PowerPC, there was no mention of >>>> store-store instruction's principle in MPC8544 Reference Manual, only >>>> said it is weak memory ordering. >>>> >>>> So, I think it is neccessary to add rte_wmb() to recv pkts function. >>>> >>>> Dong >>> >>> What I was trying to say: >>> >>> 1. I think you put barrier in a wrong place. >>> Even for machines with weak memory ordering, we need a barrier only when we are goint to update RDT, i.e: >>> if (nb_hold > rxq->rx_free_thresh) { ... ; barrier; IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr, ...); } >> Yes, I put it in a wrong place, it will reduce performance. It's better >> to place it in that you suggested. >>> >>> 2. Even with putting wmb() here, you wouldn't fix ixgbe_recv_pkts() to work on machines with weak memory ordering. >>> I think that to make it work properly, you'll need an rmb() bewtween reading DD bit and rest of RXD: >>> >>> rxdp = &rx_ring[rx_id]; >>> staterr = rxdp->wb.upper.status_error; >>> + rte_rmb(); >>> if (! (staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) >>> break; >>> rxd = *rxdp; >> Yes, it seems wmb is not enough for weak memory ordering processor. Both >> rmb and wmb are needed. >>> >>> 3. As Stephen pointed in his mail, we shouldn't penalise IA implementation with unnecessary barriers >>> As was discussed at that thread: http://dpdk.org/ml/archives/dev/2015-March/015202.html >>> probably the best is to introduce a new macros: rte_smp_*mb (or something) that would be architecture dependent: >>> compiler_barrier on IA, proper HW barrier on machines with weak memory ordering and update the code to use it. I was trying to add a new macro, but I found it didn't need a new memory barrier macro, may be a macro that can distinguish the memory barrier (rte_wmb/rte_rmb) of IA and AMD is useful. Other architecture still use the rte_wmb() and rte_rmb(). I send a patch about it, please take a look at it...... Dong >>> >>> So, if you like to fix that issue, please do that in a proper way. >>> >>> BTW, I think that for PPC support even before touching ixgbe or any other PMD, >>> step 3 (or similar) need to be done on rte_ring enqueue/dequeue code. >>> >>> Konstantin >> Yes, a new set of macros should be introduced first, then we can update >> PMD code. Did anyone are working on it now ? > > As far as I know, no one is working on it right now. > So, I suppose, you are welcome to start :) > Konstantin > >> >> Dong >>> >>>>> >>>>> >>>>>> rxq->rx_tail = rx_id; >>>>>> >>>>>> /* >>>>>> @@ -1595,6 +1598,8 @@ ixgbe_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, >>>>>> first_seg = NULL; >>>>>> } >>>>>> >>>>>> + rte_wmb(); >>>>>> + >>>>>> /* >>>>>> * Record index of the next RX descriptor to probe. >>>>>> */ >>>>>> -- >>>>>> 1.9.1 >>>>>