From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bruens
Date: Sun, 2 Apr 2017 23:34:13 +0200
Subject: [U-Boot] [PATCH v5 02/19] usb: dwc2: Use separate input and output buffers
In-Reply-To:
References: <20170401180556.2416-1-sjg@chromium.org> <7825980.L7YSPUyO6P@pebbles.site>
Message-ID: <24448596.dUqkTcymFV@pebbles.site>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
To: u-boot@lists.denx.de

On Sunday, 2 April 2017 17:43:38 CEST Simon Glass wrote:
> Hi Stefan,
>
> On 2 April 2017 at 07:10, Stefan Bruens wrote:
> > On Sunday, 2 April 2017 05:01:41 CEST Marek Vasut wrote:
> >> On 04/02/2017 01:40 AM, Simon Glass wrote:
> >> > Hi Marek,
> >> >
> >> > On 1 April 2017 at 14:15, Marek Vasut wrote:
> >> >> On 04/01/2017 08:05 PM, Simon Glass wrote:
> >> >>> On Raspberry Pi 2 and 3 a problem was noticed when enabling driver
> >> >>> model for USB: the cache invalidate after an incoming transfer
> >> >>> does not seem to work correctly.
> >> >>>
> >> >>> This may be a problem with the underlying caching implementation
> >> >>> on armv7 and armv8 but this seems very unlikely. As a work-around,
> >> >>> use separate buffers for input and output. This ensures that the
> >> >>> input buffer will not hold dirty cache data.
> >> >>
> >> >> What do you think of this patch:
> >> >> [U-Boot] usb: dwc2: invalidate the dcache before starting the DMA
> >> >
> >> > Yes that matches what I did as a hack. I didn't realise that the DMA
> >> > would go through the cache. Thanks for the pointer.
> >>
> >> DMA should not go through the cache. I have yet to review that patch,
> >> but IMO it's relevant to this problem you observe.
> >
> > DMA transfers not going through the cache is probably the problem here:
> >
> > Assume we have the aligned_buffer at address 0xdead0000
> >
> > 1. The CPU writes to address 0xdead0002. This is fine, as it is the
> >    current owner of the address. The cacheline is marked dirty.
> > 2. The CPU no longer needs the corresponding address range, and it is
> >    reallocated (i.e. freed and then allocated from dwc2) or reused
> >    (i.e. formerly out buffer, now in buffer).
> > 3. The CPU starts the DMA transfer
> > 4. The DMA transfer writes to e.g. 0xdead0000-0xdead0200 in memory.
> > 5. The CPU fetches an address aliasing with 0xdead0000. The dirty cache
> >    line is evicted, and the 0xdead0000-0xdead0040 memory contents are
> >    overwritten.
>
> This is the part I don't understand. This should be an invalidate, not
> a clean and invalidate, so there should be no memory write.
>
> Also if the CPU fetches from cached 0xdead0000 without an invalidate,
> it will not cause a cache clean. It will simply read the data from the
> cache and ignore what the DMA wrote.

The CPU does not fetch from 0xdead0000, but from an address *aliasing* with
0xdead0000. As 0xdead0000 is *dirty* (we have neither flushed (clears dirty
bit) nor invalidated (implicitly clears dirty for the address)), the cache
controller has to write out the 0xdead0000 cache line to memory before it
can reuse that line for the aliasing address.

> On armv8 we appear not to support invalidate in the code, so it makes
> sense for rpi_3.
> But for rpi_2 which seems to do a proper invalidate, I still don't see
> the problem.

Which part of the code is different between rpi2 and rpi3? The dwc2 code is
identical, is the memory invalidated in some other place?

> > Obviously, the dirty cache line from (1.) has to be cleared at the
> > beginning of (3.), as Eddie's patch does.
>
> But I still don't understand why we have to clean instead of just
> invalidate?

The patch by Eddie Cai just does an invalidate_dcache_range on the transfer
buffer, nothing else. Where do you see a "clean" (whatever that refers to)?

Kind regards,

Stefan

-- 
Stefan Brüns  /  Bergstraße 21  /  52062 Aachen
home: +49 241 53809034     mobile: +49 151 50412019
work: +49 2405 49936-424
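
For illustration only, steps 1 to 5 quoted above can be replayed on a toy
model. This is a sketch under loud assumptions, not the actual ARM cache or
the dwc2 driver: it assumes a direct-mapped write-back cache with 64-byte
lines and 4 sets, scales the buffer address down from 0xdead0000 to 0, and
uses the hypothetical names ToyCache, dma_write and run. It only shows that
a dirty line evicted after the DMA transfer overwrites the DMA data, and
that invalidating the buffer before the transfer avoids this.

```python
LINE = 64        # assumed cache line size in bytes
NUM_SETS = 4     # tiny direct-mapped cache, purely illustrative


class ToyCache:
    """Direct-mapped write-back cache in front of a bytearray 'memory'."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}  # set index -> [tag, line data, dirty flag]

    def _split(self, addr):
        line_no = addr // LINE
        return line_no % NUM_SETS, line_no // NUM_SETS

    def _fill(self, addr):
        index, tag = self._split(addr)
        entry = self.lines.get(index)
        if entry is not None and entry[0] == tag:
            return entry  # cache hit
        if entry is not None and entry[2]:
            # Evicting a dirty line: it is written back to memory first.
            base = (entry[0] * NUM_SETS + index) * LINE
            self.memory[base:base + LINE] = entry[1]
        base = (addr // LINE) * LINE
        entry = [tag, bytearray(self.memory[base:base + LINE]), False]
        self.lines[index] = entry
        return entry

    def cpu_write(self, addr, value):
        entry = self._fill(addr)
        entry[1][addr % LINE] = value
        entry[2] = True  # line is now dirty

    def cpu_read(self, addr):
        return self._fill(addr)[1][addr % LINE]

    def invalidate(self, addr, length):
        """Drop lines covering [addr, addr + length) without write-back."""
        for a in range(addr - addr % LINE, addr + length, LINE):
            index, tag = self._split(a)
            entry = self.lines.get(index)
            if entry is not None and entry[0] == tag:
                del self.lines[index]


def dma_write(memory, addr, data):
    """The DMA engine bypasses the cache and writes straight to memory."""
    memory[addr:addr + len(data)] = data


def run(invalidate_before_dma):
    memory = bytearray(4096)
    cache = ToyCache(memory)
    buf = 0x000                       # stands in for 0xdead0000
    alias = buf + NUM_SETS * LINE     # maps to the same cache set as buf
    cache.cpu_write(buf + 2, 0xAA)    # step 1: the line becomes dirty
    if invalidate_before_dma:
        cache.invalidate(buf, 0x200)  # invalidate before starting the DMA
    dma_write(memory, buf, bytes([0x55]) * 0x200)  # steps 3 and 4
    cache.cpu_read(alias)             # step 5: evicts a dirty line, if any
    cache.invalidate(buf, 0x200)      # invalidate after the transfer
    return cache.cpu_read(buf)        # first byte of the buffer as the CPU sees it
```

In this model run(False) returns stale data even though the buffer is
invalidated after the transfer, because the write-back in step 5 has already
overwritten the first line of the DMA data in memory. run(True) returns the
byte written by the DMA.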