From mboxrd@z Thu Jan 1 00:00:00 1970 From: Anssi Hannula Subject: Re: [PATCH 3/3] net: macb: add missing barriers when reading buffers Date: Wed, 5 Dec 2018 16:00:05 +0200 Message-ID: References: <20181130182137.27974-1-anssi.hannula@bitwise.fi> <20181130182137.27974-4-anssi.hannula@bitwise.fi> <6378cbaf-2c8d-3c22-2d2d-632c32c6195a@microchip.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Cc: Nicolas.Ferre@microchip.com, davem@davemloft.net, netdev@vger.kernel.org To: Claudiu.Beznea@microchip.com Return-path: Received: from mail.bitwise.fi ([109.204.228.163]:44260 "EHLO mail.bitwise.fi" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726918AbeLEOAT (ORCPT ); Wed, 5 Dec 2018 09:00:19 -0500 In-Reply-To: <6378cbaf-2c8d-3c22-2d2d-632c32c6195a@microchip.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 5.12.2018 14:37, Claudiu.Beznea@microchip.com wrote: > > On 30.11.2018 20:21, Anssi Hannula wrote: >> When reading buffer descriptors on RX or on TX completion, an >> RX_USED/TX_USED bit is checked first to ensure that the descriptor has >> been populated. However, there are no memory barriers to ensure that the >> data protected by the RX_USED/TX_USED bit is up-to-date with respect to >> that bit. >> >> Fix that by adding DMA read memory barriers on those paths. >> >> I did not observe any actual issues caused by these being missing, >> though. >> >> Tested on a ZynqMP based system. >> >> Signed-off-by: Anssi Hannula >> Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver") >> Cc: Nicolas Ferre >> --- >> drivers/net/ethernet/cadence/macb_main.c | 20 ++++++++++++++++---- >> 1 file changed, 16 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c >> index 430b7a0f5436..c93baa8621d5 100644 >> --- a/drivers/net/ethernet/cadence/macb_main.c >> +++ b/drivers/net/ethernet/cadence/macb_main.c >> @@ -861,6 +861,11 @@ static void macb_tx_interrupt(struct macb_queue *queue) >> >> /* First, update TX stats if needed */ >> if (skb) { >> + /* Ensure all of desc is at least as up-to-date >> + * as ctrl (TX_USED bit) >> + */ >> + dma_rmb(); >> + > Is this necessary? Wouldn't previous rmb() take care of this? At this time > data specific to this descriptor was completed. The TX descriptors for next > data to be send is updated under a locked spinlock. The previous rmb() is before the TX_USED check, so my understanding is that the following could happen in theory: 1. rmb(). 2. Reads are reordered so that TX timestamp is read first - no barriers are crossed. 3. HW writes timestamp and sets TX_USED (or they become visible). 4. Code checks TX_USED. 5. Code operates on timestamp that is actually garbage. I'm not 100% sure that there isn't some lighter/cleaner way to do this than dma_rmb(), though. >> if (gem_ptp_do_txstamp(queue, skb, desc) == 0) { >> /* skb now belongs to timestamp buffer >> * and will be removed later >> @@ -1000,12 +1005,16 @@ static int gem_rx(struct macb_queue *queue, int budget) >> rmb(); >> >> rxused = (desc->addr & MACB_BIT(RX_USED)) ? true : false; >> - addr = macb_get_addr(bp, desc); >> - ctrl = desc->ctrl; >> >> if (!rxused) >> break; >> >> + /* Ensure other data is at least as up-to-date as rxused */ >> + dma_rmb(); > Same here, wouldn't previous rmb() should do this job? The scenario I'm concerned about here (and in the last hunk) is: 1. rmb(). 2. ctrl is read (i.e. ctrl read reordered before addr read). 3. HW updates to ctrl and addr become visible. 4. RX_USED check. 5. code operates on garbage ctrl. I think it may be OK to move the earlier rmb() outside the loop so that there is an rmb() only before and after the RX loop, as I don't at least immediately see any hard requirement to do it on each loop pass (unlike the added dma_rmb()). But my intent was to fix issues instead of optimization so I didn't look too closely into that. >> + >> + addr = macb_get_addr(bp, desc); >> + ctrl = desc->ctrl; >> + >> queue->rx_tail++; >> count++; >> >> @@ -1180,11 +1189,14 @@ static int macb_rx(struct macb_queue *queue, int budget) >> /* Make hw descriptor updates visible to CPU */ >> rmb(); >> >> - ctrl = desc->ctrl; >> - >> if (!(desc->addr & MACB_BIT(RX_USED))) >> break; >> >> + /* Ensure other data is at least as up-to-date as addr */ >> + dma_rmb(); > Ditto > >> + >> + ctrl = desc->ctrl; >> + >> if (ctrl & MACB_BIT(RX_SOF)) { >> if (first_frag != -1) >> discard_partial_frame(queue, first_frag, tail); >> -- Anssi Hannula / Bitwise Oy +358 503803997