From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dong Wang Subject: Re: [PATCH] ixgbe:Add write memory barrier for recv pkts. Date: Thu, 16 Apr 2015 23:55:21 +0800 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable To: "Ananyev, Konstantin" , Return-path: List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces-VfR2kkLFssw@public.gmane.org Sender: "dev" OK=2C let me start it~~~ I'll send another patch after several days. Dong --- Original Message --- From: "Ananyev=2C Konstantin" Sent: April 16=2C 2015 11:14 PM To: "Wang Dong" =2C dev-VfR2kkLFssw@public.gmane.org Subject: RE: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv pkt= s. > -----Original Message----- > From: outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org [mailto:outlook_739db8e1c4bc6f= ae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org] On Behalf Of Wang Dong > Sent: Thursday=2C April 16=2C 2015 12:36 PM > To: Ananyev=2C Konstantin=3B dev-VfR2kkLFssw@public.gmane.org > Subject: Re: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv p= kts. > > > > > > >> -----Original Message----- > >> From: outlook_739db8e1c4bc6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org [mailto:outlook_739db8e1c4b= c6fae-1ViLX0X+lBJBDgjK7y7TUQ@public.gmane.org] On Behalf Of Dong.Wang > >> Sent: Wednesday=2C April 15=2C 2015 2:46 PM > >> To: Ananyev=2C Konstantin=3B dev-VfR2kkLFssw@public.gmane.org > >> Subject: Re: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for rec= v pkts. > >> > >> > >> > >>> Hi=2C > >>> > >>>> -----Original Message----- > >>>> From: dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] On Behalf Of WangDong > >>>> Sent: Saturday=2C April 11=2C 2015 4:34 PM > >>>> To: dev-VfR2kkLFssw@public.gmane.org > >>>> Subject: [dpdk-dev] [PATCH] ixgbe:Add write memory barrier for recv = pkts. > >>>> > >>>> Like transmit packets=2C before update receive descriptor's tail poi= nter=2C rte_wmb() should be added after writing recv descriptor. > >>>> > >>>> Signed-off-by: Dong Wang > >>>> --- > >>>> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 5 +++++ > >>>> 1 file changed=2C 5 insertions(+) > >>>> > >>>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgb= e/ixgbe_rxtx.c > >>>> index 9da2c7e..d504688 100644 > >>>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>>> @@ -1338=2C6 +1338=2C9 @@ ixgbe_recv_pkts(void *rx_queue=2C struct r= te_mbuf **rx_pkts=2C > >>>> */ > >>>> rx_pkts[nb_rx++] =3D rxm=3B > >>>> } > >>>> + > >>>> + rte_wmb()=3B > >>>> + > >>> > >>> Why do you think it is necessary? > >>> I can't see any good reason to put wmb() here. > >>> I would understand if=2C at least you'll try to insert it just before= updating RDT: > >>> rx_id =3D (uint16_t) ((rx_id =3D=3D 0) ? > >>> (rxq->nb_rx_desc - 1) : (rx_id= - 1))=3B > >>> + rte_wmb()=3B > >>> IXGBE_PCI_REG_WRITE(rxq->rdt_reg_addr=2C rx_id)=3B > >>> > >>> That is not needed IA with current implementation=2C but would make s= ense for machines with relaxed memory ordering. > >>> Though right now DPDK IXGBE PMD is supported only on IA=2C anyway. > >>> Same for ixgbe_recv_scattered_pkts(). > >>> > >>> Konstantin > >> > >> Yes=2C current implementation works well with IA=2C and the transmit p= ackets > >> function's rte_wmb() is also unneccessary. > >> > >> But there are two reasons for adding rte_wmb() in recv pkts function: > >> 1) The memory barrier in recv pkts function and xmit pkts function are > >> inconsistent=2C rte_wmb() should be added to recv pkts function or be > >> removed from xmit pkts function. > >> 2) DPDK will support PowerPC processor (Other developers are working o= n > >> it)=2C I check the memory ordering of PowerPC=2C there was no mention = of > >> store-store instruction's principle in MPC8544 Reference Manual=2C onl= y > >> said it is weak memory ordering. > >> > >> So=2C I think it is neccessary to add rte_wmb() to recv pkts function. > >> > >> Dong > > > > What I was trying to say: > > > > 1. I think you put barrier in a wrong place. > > Even for machines with weak memory ordering=2C we need a barrier only w= hen we are goint to update RDT=2C i.e: > > if (nb_hold > rxq->rx_free_thresh) { ... =3B barrier=3B IXGBE_PCI_REG_W= RITE(rxq->rdt_reg_addr=2C ...)=3B } > Yes=2C I put it in a wrong place=2C it will reduce performance. It's bett= er > to place it in that you suggested. > > > > 2. Even with putting wmb() here=2C you wouldn't fix ixgbe_recv_pkts() = to work on machines with weak memory ordering. > > I think that to make it work properly=2C you'll need an rmb() bewtween = reading DD bit and rest of RXD: > > > > rxdp =3D &rx_ring[rx_id]=3B > > staterr =3D rxdp->wb.upper.status_error=3B > > + rte_rmb()=3B > > if (! (staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) > > break=3B > > rxd =3D *rxdp=3B > Yes=2C it seems wmb is not enough for weak memory ordering processor. Bot= h > rmb and wmb are needed. > > > > 3. As Stephen pointed in his mail=2C we shouldn't penalise IA implement= ation with unnecessary barriers > > As was discussed at that thread: http://dpdk.org/ml/archives/dev/2015-= March/015202.html > > probably the best is to introduce a new macros: rte_smp_*mb (or somethi= ng) that would be architecture dependent: > > compiler_barrier on IA=2C proper HW barrier on machines with weak memor= y ordering and update the code to use it. > > > > So=2C if you like to fix that issue=2C please do that in a proper way. > > > > BTW=2C I think that for PPC support even before touching ixgbe or any o= ther PMD=2C > > step 3 (or similar) need to be done on rte_ring enqueue/dequeue code. > > > > Konstantin > Yes=2C a new set of macros should be introduced first=2C then we can upd= ate > PMD code. Did anyone are working on it now ? As far as I know=2C no one is working on it right now. So=2C I suppose=2C you are welcome to start :) Konstantin > > Dong > > > >>> > >>> > >>>> rxq->rx_tail =3D rx_id=3B > >>>> > >>>> /* > >>>> @@ -1595=2C6 +1598=2C8 @@ ixgbe_recv_scattered_pkts(void *rx_queue= =2C struct rte_mbuf **rx_pkts=2C > >>>> first_seg =3D NULL=3B > >>>> } > >>>> > >>>> + rte_wmb()=3B > >>>> + > >>>> /* > >>>> * Record index of the next RX descriptor to probe. > >>>> */ > >>>> -- > >>>> 1.9.1 > >>>