From mboxrd@z Thu Jan 1 00:00:00 1970 From: Anssi Hannula Subject: Re: [PATCH 3/3] net: macb: add missing barriers when reading buffers Date: Wed, 12 Dec 2018 13:27:14 +0200 Message-ID: <70b55891-1bc4-be1a-1e45-09f073d3e021@bitwise.fi> References: <20181130182137.27974-1-anssi.hannula@bitwise.fi> <20181130182137.27974-4-anssi.hannula@bitwise.fi> <6378cbaf-2c8d-3c22-2d2d-632c32c6195a@microchip.com> <567c5ea6-d74d-5398-aee5-2b486ddff983@bitwise.fi> <0212fb04-8c2d-264c-a459-6d579a56507e@microchip.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Cc: Nicolas.Ferre@microchip.com, davem@davemloft.net, netdev@vger.kernel.org To: Claudiu.Beznea@microchip.com Return-path: Received: from mail.bitwise.fi ([109.204.228.163]:55488 "EHLO mail.bitwise.fi" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726869AbeLLL12 (ORCPT ); Wed, 12 Dec 2018 06:27:28 -0500 In-Reply-To: <0212fb04-8c2d-264c-a459-6d579a56507e@microchip.com> Content-Language: en-US Sender: netdev-owner@vger.kernel.org List-ID: On 12.12.2018 12:58, Claudiu.Beznea@microchip.com wrote: > > On 11.12.2018 15:21, Anssi Hannula wrote: >> On 10.12.2018 12:34, Claudiu.Beznea@microchip.com wrote: >>> On 07.12.2018 14:00, Anssi Hannula wrote: >>>> On 6.12.2018 16:14, Claudiu.Beznea@microchip.com wrote: >>>>> Hi Anssi, >>>> Hi! >>>> >>>>> On 05.12.2018 16:00, Anssi Hannula wrote: >>>>>> On 5.12.2018 14:37, Claudiu.Beznea@microchip.com wrote: >>>>>>> On 30.11.2018 20:21, Anssi Hannula wrote: >>>>>>>> When reading buffer descriptors on RX or on TX completion, an >>>>>>>> RX_USED/TX_USED bit is checked first to ensure that the descriptor has >>>>>>>> been populated. However, there are no memory barriers to ensure that the >>>>>>>> data protected by the RX_USED/TX_USED bit is up-to-date with respect to >>>>>>>> that bit. >>>>>>>> >>>>>>>> Fix that by adding DMA read memory barriers on those paths. >>>>>>>> >>>>>>>> I did not observe any actual issues caused by these being missing, >>>>>>>> though. >>>>>>>> >>>>>>>> Tested on a ZynqMP based system. >>>>>>>> >>>>>>>> Signed-off-by: Anssi Hannula >>>>>>>> Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver") >>>>>>>> Cc: Nicolas Ferre >>>>>>>> --- >>>>>>>> drivers/net/ethernet/cadence/macb_main.c | 20 ++++++++++++++++---- >>>>>>>> 1 file changed, 16 insertions(+), 4 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c >>>>>>>> index 430b7a0f5436..c93baa8621d5 100644 >>>>>>>> --- a/drivers/net/ethernet/cadence/macb_main.c >>>>>>>> +++ b/drivers/net/ethernet/cadence/macb_main.c >>>>>>>> @@ -861,6 +861,11 @@ static void macb_tx_interrupt(struct macb_queue *queue) >>>>>>>> >>>>>>>> /* First, update TX stats if needed */ >>>>>>>> if (skb) { >>>>>>>> + /* Ensure all of desc is at least as up-to-date >>>>>>>> + * as ctrl (TX_USED bit) >>>>>>>> + */ >>>>>>>> + dma_rmb(); >>>>>>>> + >>>>>>> Is this necessary? Wouldn't previous rmb() take care of this? At this time >>>>>>> data specific to this descriptor was completed. The TX descriptors for next >>>>>>> data to be send is updated under a locked spinlock. >>>>>> The previous rmb() is before the TX_USED check, so my understanding is >>>>>> that the following could happen in theory: >>>>> We are using this IP on and ARM architecture, so, with regards to rmb(), I >>>>> understand from [1] that dsb completes when: >>>>> "All explicit memory accesses before this instruction complete. >>>>> All Cache, Branch predictor and TLB maintenance operations before this >>>>> instruction complete." >>>>> >>>>>> 1. rmb(). >>>>> According to [1] this should end after all previous instructions (loads, >>>>> stores) ends. >>>>> >>>>>> 2. Reads are reordered so that TX timestamp is read first - no barriers >>>>>> are crossed. >>>>> But, as per [1], no onward instruction will be reached until all >>>>> instruction prior to dsb ends, so, after rmb() all descriptor's members >>>>> should be updated, right? >>>> The descriptor that triggered the TX interrupt should be visible now, yes. >>>> However, the controller may be writing to any other descriptors at the >>>> same time as the loop is processing through them, as there are multiple >>>> TX buffers. >>> Yes, but there is the "rmb()" after "desc = macb_tx_desc(queue, tail);" at >>> the beginning of loop intended for that... In the 2nd loop you are >>> operating with the same descriptor which was read in the first loop. >> I'm concerned about the 2nd iteration of the first for loop in >> macb_tx_interrupt(). >> That operates on a different descriptor due to tail++, and that >> different descriptor may have in-flight writes from controller as the >> interrupt may or may not already be raised for it. > I agree with that, there may be in-flights updates, but since ownership bit > should be updated at the end of the TX (I expect at this moment to be > updated all the descriptor's parts, including timestamps), if you read a > descriptor and it hasn't been treated by the hardware the descriptor is own > by the hardware and there is this check after ctrl is read: > > if (!(ctrl & MACB_BIT(TX_USED))) > break; > > Next time a new tx interrupt will be raised this descriptor would be also > treated. > > In case the above instruction doesn't break then it means the descriptor > was transmitted. And since ownership should be updated at the end of the > TX, this means no in-flights updates should be done on this descriptor, so, > when you will read the timestamps these should be the good ones (even it is > the first one, the 3rd, and so on). Agreed with above, the only problem is if the ts1/ts2 load is reordered before ctrl load. > Moreover, this new barrier barrier is in the 2nd loop, you're placing a > memory barrier after every step in the 2nd loop for the same descriptor > read in the 1st loop (it was already loaded in a processor register). The barrier is inside "if (skb)", and skb is only set for the last frame as indicated by the code at the end of the loop:             /* skb is set only for the last buffer of the frame.              * WARNING: at this point skb has been freed by              * macb_tx_unmap().              */             if (skb)                 break; > To avoid reordering of ctrl and ts1/ts2 loads wouldn't be enough to have > memory barrier after: > ctrl = desc->ctrl; Yes, it would. My v2 moved it into the other direction, into the ts code, though, which has the added benefit of no barrier when timestamps are disabled (which IIRC is the default). But I can move it to just after ctrl if you so prefer. >> Or am I missing something? >> >>>>>> 3. HW writes timestamp and sets TX_USED (or they become visible). >>>>> I expect hardware to set TX_USED and timestamp before raising TX complete >>>>> interrupt. If so, there should be no on-flight updates of this descriptor, >>>>> right? Hardware raised a TX_USED bit read interrupt when it reads a >>>>> descriptor like this and hangs TX. >>>> For the first iteration of the loop, that is correct - there should be >>>> no in-flight writes from controller as it already raised the interrupt. >>>> However, the following iterations of the loop process descriptors that >>>> may or may not have the interrupt raised yet, and therefore may still >>>> have in-flight writes. >>> I expect the hardware to give up ownership of the descriptor just at the >>> end of TX operation, so that the word containing TX_USED to be the last one >>> updated. If you reads the descriptor and TX_USED is not there the first >>> loop breaks, queue->tx_tail is updated with the last processed descriptor >>> in the queue (see "queue->tx_tail = tail;") then the next interrupt will >>> start processing with this last one descriptor. >> Agreed, that is how it works. Issues only occur if we somehow operate on >> data before ownership was transferred. > Could this happens? Shouldn't this: > if (!(ctrl & MACB_BIT(TX_USED))) > break; > > assure that this could not happen and the fact that the ownership is passed > to CPU at the end of descriptor update? I don't see how that prevents desc->ctrl and desc->ts_1/ts_2 reordering, which is the only issue that I can see. By "data" above I meant ts_1/ts_2, AFAICS there is nothing else we read from the descriptors aside from ctrl itself. > If load reordering b/w desc->ctrl and desc->ts_1/ts_2 wouldn't be enough to > have a barrier after > ctrl = desc->ctrl instruction? Yes, per above. >>>>>> 4. Code checks TX_USED. >>>>>> 5. Code operates on timestamp that is actually garbage. >>>>>> >>>>>> I'm not 100% sure that there isn't some lighter/cleaner way to do this >>>>>> than dma_rmb(), though. >>>>> If you still think this scenario could happen why not calling a dsb in >>>>> gem_ptp_do_timestamp(). I feel like that is a proper place to call it. >>>> OK, I will move it there. Unless we arrive at a conclusion that it is >>>> unnecessary altogether, of course :) >>>> >>>>> Moreover, there is bit 32 of desc->ctrl which tells you if a valid >>>>> timestamp was placed in the descriptor. But, again, I expect the timestamp >>>>> and TX_USED to be set by hardware before raising TX complete interrupt. >>>> Yes, but since my concern is that without barriers in between, >>>> desc->ctrl might be read after ts_1/ts_2, so that bit might be seen as >>>> set even though ts_1 is not yet an actual timestamp. And per above, all >>>> this may occur before the TX complete interrupt is raised for the >>>> descriptor in question. >>> If so, why not placing the barrier when reading timestamps? From my point >>> of view the place of "dma_rmb()" in this patch doesn't guarantees that >>> ts1/ts2 were read after the execution of barrier (correct me if I'm wrong). >> If the timestamp ts1/ts2 reads are done after dma_rmb(), and dev->ctrl >> is read before the barrier, my understanding is that they are guaranteed >> to be ordered so that dev->ctrl is read before ts1/ts2. >> >> Per [1], "[DMB] ensures that all explicit memory accesses that appear in >> program order before the |DMB| instruction are observed before any >> explicit memory accesses that appear in program order after the |DMB| >> instruction". >> And the compiler barrier ("memory" asm clobber) included within >> dma_rmb() ensures that in program order, ctrl is loaded before the >> barrier, and ts1/ts2 after it. >> >> But as mentioned, it makes sense to have it closer to the ts1/ts2 reads >> so I'll make that change. >> >> [1] >> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0489c/CIHGHHIE.html >> >> >>>> I agree that this TX case seems somewhat unlikely to be triggered in >>>> real life at least at present, though, as the ts_1/ts_2 read is so far >>>> after the desc->ctrl read, and behind function calls, so unlikely to be >>>> reordered before desc->ctrl read. >>>> But the similar RX cases below seem more problematic as the racy loads >>>> in question are right after each other in code. >>>> [...] -- Anssi Hannula / Bitwise Oy +358 503803997