* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
@ 2017-04-03 17:49 Philipp Tomsich
  2017-04-04 16:15 ` Marek Vasut
  0 siblings, 1 reply; 17+ messages in thread
From: Philipp Tomsich @ 2017-04-03 17:49 UTC (permalink / raw)
  To: u-boot

Merely using dma_alloc_coherent does not ensure that there is no stale
data left in the caches for the allocated DMA buffer (i.e. the affected
cacheline may still be dirty).

The original code was doing the following (on AArch64, which
translates a 'flush' into a 'clean + invalidate'):
  # during initialisation:
      1. allocate buffers via memalign
      	 => buffers may still be modified (cached, dirty)
  # during interrupt processing
      2. clean + invalidate buffers
      	 => may commit stale data from a modified cacheline
      3. read from buffers

This could lead to garbage info being written to buffers before
reading them during event processing.

To make the event processing more robust, we use the following sequence
for the cache-maintenance:
  # during initialisation:
      1. allocate buffers via memalign
      2. clean + invalidate buffers
      	 (we only need the 'invalidate' part, but dwc3_flush_cache()
	  always performs a 'clean + invalidate')
  # during interrupt processing
      3. read the buffers
      	 (we know these lines are not cached, due to the previous
	  invalidation and no other code touching them in-between)
      4. clean + invalidate buffers
      	 => writes back any modification we may have made during event
	    processing and ensures that the lines are not in the cache
	    the next time we enter interrupt processing

Note that with the original sequence, we observe reproducible issues in
the event processing (a fatal synchronous abort in
dwc3_gadget_uboot_handle_interrupt the first time interrupt handling is
invoked) when running USB mass storage emulation on our RK3399-Q7 with
data-caches on. Whether the issue triggers depends on the cache state:
e.g. running dhcp/usb start beforehand will upset the caches enough to
get us around it.

Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>

---
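A minimal sketch of the two orderings described above, using a toy
one-line cache model rather than real hardware. struct toy_line and all
helper names are made up for illustration and are not part of U-Boot;
the model only assumes that 'flush' means 'clean + invalidate' and that
the device writes straight to RAM.

```c
#include <stdint.h>
#include <string.h>

#define LINE 64

/* Toy model: one cache line in front of one line of "RAM". */
struct toy_line {
	uint8_t ram[LINE];	/* what the DMA master reads and writes */
	uint8_t cache[LINE];	/* what the CPU sees while the line is valid */
	int valid, dirty;
};

static void cpu_write(struct toy_line *l, uint8_t v)
{
	memset(l->cache, v, LINE);
	l->valid = l->dirty = 1;
}

static uint8_t cpu_read(struct toy_line *l)
{
	if (!l->valid) {		/* miss: fill the line from RAM */
		memcpy(l->cache, l->ram, LINE);
		l->valid = 1;
		l->dirty = 0;
	}
	return l->cache[0];
}

static void dma_write(struct toy_line *l, uint8_t v)
{
	memset(l->ram, v, LINE);	/* the device bypasses the cache */
}

/* 'flush' here: clean (write back if dirty), then invalidate */
static void flush(struct toy_line *l)
{
	if (l->valid && l->dirty)
		memcpy(l->ram, l->cache, LINE);
	l->valid = l->dirty = 0;
}

/* Original order: the first flush happens only in the handler. */
uint8_t original_sequence(void)
{
	struct toy_line l = {0};

	cpu_write(&l, 0xAA);	/* earlier heap user leaves the line dirty */
	dma_write(&l, 0x01);	/* controller posts an event to RAM */
	flush(&l);		/* the clean pushes 0xAA over the event */
	return cpu_read(&l);
}

/* Patched order: flush at allocation, read, then flush again. */
uint8_t patched_sequence(void)
{
	struct toy_line l = {0};
	uint8_t ev;

	cpu_write(&l, 0xAA);
	flush(&l);		/* at allocation time */
	dma_write(&l, 0x01);
	ev = cpu_read(&l);	/* miss, so the event is read from RAM */
	flush(&l);		/* after event processing */
	return ev;
}
```

In this model original_sequence() returns the stale 0xAA, while
patched_sequence() returns the 0x01 that the device wrote.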

 drivers/usb/dwc3/core.c   | 2 ++
 drivers/usb/dwc3/gadget.c | 5 +++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index b2c7eb1..f58c7ba 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
 	if (!evt->buf)
 		return ERR_PTR(-ENOMEM);
 
+	dwc3_flush_cache((long)evt->buf, evt->length);
+
 	return evt;
 }
 
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1156662..61af71b 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
 		int i;
 		struct dwc3_event_buffer *evt;
 
+		dwc3_thread_interrupt(0, dwc);
+
+		/* Clean + Invalidate the buffers after touching them */
 		for (i = 0; i < dwc->num_event_buffers; i++) {
 			evt = dwc->ev_buffs[i];
 			dwc3_flush_cache((long)evt->buf, evt->length);
 		}
-
-		dwc3_thread_interrupt(0, dwc);
 	}
 }
-- 
1.9.1


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-03 17:49 [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust Philipp Tomsich
@ 2017-04-04 16:15 ` Marek Vasut
  2017-04-04 17:46   ` Dr. Philipp Tomsich
  2017-04-05  8:15   ` Felipe Balbi
  0 siblings, 2 replies; 17+ messages in thread
From: Marek Vasut @ 2017-04-04 16:15 UTC (permalink / raw)
  To: u-boot

On 04/03/2017 07:49 PM, Philipp Tomsich wrote:
> Merely using dma_alloc_coherent does not ensure that there is no stale
> data left in the caches for the allocated DMA buffer (i.e. that the
> affected cacheline may still be dirty).
> 
> The original code was doing the following (on AArch64, which
> translates a 'flush' into a 'clean + invalidate'):
>   # during initialisation:
>       1. allocate buffers via memalign
>       	 => buffers may still be modified (cached, dirty)
>   # during interrupt processing
>       2. clean + invalidate buffers
>       	 => may commit stale data from a modified cacheline
>       3. read from buffers
> 
> This could lead to garbage info being written to buffers before
> reading them during event processing.
> 
> To make the event processing more robust, we use the following sequence
> for the cache-maintenance:
>   # during initialisation:
>       1. allocate buffers via memalign
>       2. clean + invalidate buffers
>       	 (we only need the 'invalidate' part, but dwc3_flush_cache()
> 	  always performs a 'clean + invalidate')
>   # during interrupt processing
>       3. read the buffers
>       	 (we know these lines are not cached, due to the previous
> 	  invalidation and no other code touching them in-between)
>       4. clean + invalidate buffers
>       	 => writes back any modification we may have made during event
> 	    processing and ensures that the lines are not in the cache
> 	    the next time we enter interrupt processing
> 
> Note that with the original sequence, we observe reproducible
> (depending on the cache state: i.e. running dhcp/usb start before will
> upset caches to get us around this) issues in the event processing (a
> fatal synchronous abort in dwc3_gadget_uboot_handle_interrupt on the
> first time interrupt handling is invoked) when running USB mass
> storage emulation on our RK3399-Q7 with data-caches on.
> 
> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
> 
> ---
> 
>  drivers/usb/dwc3/core.c   | 2 ++
>  drivers/usb/dwc3/gadget.c | 5 +++--
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
> index b2c7eb1..f58c7ba 100644
> --- a/drivers/usb/dwc3/core.c
> +++ b/drivers/usb/dwc3/core.c
> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>  	if (!evt->buf)
>  		return ERR_PTR(-ENOMEM);
>  
> +	dwc3_flush_cache((long)evt->buf, evt->length);
> +

Is the length aligned ? If not, you will get cache alignment warning.
Also, address should be uintptr_t to avoid 32/64 bit issues .

>  	return evt;
>  }
>  
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 1156662..61af71b 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>  		int i;
>  		struct dwc3_event_buffer *evt;
>  
> +		dwc3_thread_interrupt(0, dwc);
> +
> +		/* Clean + Invalidate the buffers after touching them */
>  		for (i = 0; i < dwc->num_event_buffers; i++) {
>  			evt = dwc->ev_buffs[i];
>  			dwc3_flush_cache((long)evt->buf, evt->length);
>  		}
> -

This makes me wonder, don't you need to invalidate the event buffer
somewhere so that the new data would be fetched from RAM ?

> -		dwc3_thread_interrupt(0, dwc);
>  	}
>  }
> 

One last thing, is this patch needed in Linux too ?

-- 
Best regards,
Marek Vasut


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 16:15 ` Marek Vasut
@ 2017-04-04 17:46   ` Dr. Philipp Tomsich
  2017-04-04 19:01     ` Marek Vasut
  2017-04-05  8:15   ` Felipe Balbi
  1 sibling, 1 reply; 17+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-04 17:46 UTC (permalink / raw)
  To: u-boot


> On 04 Apr 2017, at 18:15, Marek Vasut <marex@denx.de> wrote:
> 
> On 04/03/2017 07:49 PM, Philipp Tomsich wrote:
>> Merely using dma_alloc_coherent does not ensure that there is no stale
>> data left in the caches for the allocated DMA buffer (i.e. that the
>> affected cacheline may still be dirty).
>> 
>> The original code was doing the following (on AArch64, which
>> translates a 'flush' into a 'clean + invalidate'):
>>  # during initialisation:
>>      1. allocate buffers via memalign
>>      	 => buffers may still be modified (cached, dirty)
>>  # during interrupt processing
>>      2. clean + invalidate buffers
>>      	 => may commit stale data from a modified cacheline
>>      3. read from buffers
>> 
>> This could lead to garbage info being written to buffers before
>> reading them during event processing.
>> 
>> To make the event processing more robust, we use the following sequence
>> for the cache-maintenance:
>>  # during initialisation:
>>      1. allocate buffers via memalign
>>      2. clean + invalidate buffers
>>      	 (we only need the 'invalidate' part, but dwc3_flush_cache()
>> 	  always performs a 'clean + invalidate')
>>  # during interrupt processing
>>      3. read the buffers
>>      	 (we know these lines are not cached, due to the previous
>> 	  invalidation and no other code touching them in-between)
>>      4. clean + invalidate buffers
>>      	 => writes back any modification we may have made during event
>> 	    processing and ensures that the lines are not in the cache
>> 	    the next time we enter interrupt processing
>> 
>> Note that with the original sequence, we observe reproducible
>> (depending on the cache state: i.e. running dhcp/usb start before will
>> upset caches to get us around this) issues in the event processing (a
>> fatal synchronous abort in dwc3_gadget_uboot_handle_interrupt on the
>> first time interrupt handling is invoked) when running USB mass
>> storage emulation on our RK3399-Q7 with data-caches on.
>> 
>> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>> 
>> ---
>> 
>> drivers/usb/dwc3/core.c   | 2 ++
>> drivers/usb/dwc3/gadget.c | 5 +++--
>> 2 files changed, 5 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>> index b2c7eb1..f58c7ba 100644
>> --- a/drivers/usb/dwc3/core.c
>> +++ b/drivers/usb/dwc3/core.c
>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>> 	if (!evt->buf)
>> 		return ERR_PTR(-ENOMEM);
>> 
>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>> +
> 
> Is the length aligned ? If not, you will get cache alignment warning.
> Also, address should be uintptr_t to avoid 32/64 bit issues .

The length is a well-known value and aligned (it expands to PAGE_SIZE in the end).
Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
	drivers/usb/dwc3/core.c:        dwc3_flush_cache((long)evt->buf, evt->length);
	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)buf_dma, len);
	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)trb, sizeof(*trb));
	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)trb, sizeof(*trb));
	drivers/usb/dwc3/ep0.c:                 dwc3_flush_cache((long)trb, sizeof(*trb));
	drivers/usb/dwc3/ep0.c:         dwc3_flush_cache((long)dwc->ep0_bounce, DWC3_EP0_BOUNCE_SIZE);
	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)req->request.dma, req->request.length);
	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)dma, length);
	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)trb, sizeof(*trb));
	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)trb, sizeof(*trb));
	drivers/usb/dwc3/gadget.c:                      dwc3_flush_cache((long)evt->buf, evt->length);
	drivers/usb/dwc3/io.h:static inline void dwc3_flush_cache(int addr, int length)

Worst of all: the definition of dwc3_flush_cache in io.h has “int” as a type, which will eat us alive if the DWC3’s physical address is beyond 32-bit.

I’ll revise all of these and make a patch-series out of this.
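
As a rough illustration of why the parameter types matter (this is not
the planned series): addr_through_32bit, flush_range_for and
TOY_CACHELINE are hypothetical names, and the 64-byte line size is only
an assumption. The first helper shows what survives of an address once
it passes through a 32-bit parameter; the second describes a range with
pointer-sized types and rounds it outwards to whole cache lines.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHELINE 64u	/* assumed line size, for illustration only */

/* What a 32-bit parameter keeps of an address (well-defined modulo 2^32). */
uint32_t addr_through_32bit(uint64_t addr)
{
	return (uint32_t)addr;
}

/* Hypothetical range description using pointer-sized types. */
struct flush_range {
	uintptr_t start;
	uintptr_t end;		/* exclusive */
};

/* Round [addr, addr + len) outwards to whole cache lines. */
struct flush_range flush_range_for(uintptr_t addr, size_t len)
{
	struct flush_range r;
	uintptr_t mask = (uintptr_t)TOY_CACHELINE - 1;

	r.start = addr & ~mask;
	r.end = (addr + len + mask) & ~mask;
	return r;
}
```

For example, addr_through_32bit(0x100001000ULL) yields 0x1000, i.e. a
different address 4 GiB lower, and flush_range_for(0x1010, 0x20) covers
0x1000 up to 0x1040. Rounding outwards is only reasonable for a
clean + invalidate; for a pure invalidate it could discard neighbouring
data that shares the line.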

>> 	return evt;
>> }
>> 
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 1156662..61af71b 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>> 		int i;
>> 		struct dwc3_event_buffer *evt;
>> 
>> +		dwc3_thread_interrupt(0, dwc);
>> +
>> +		/* Clean + Invalidate the buffers after touching them */
>> 		for (i = 0; i < dwc->num_event_buffers; i++) {
>> 			evt = dwc->ev_buffs[i];
>> 			dwc3_flush_cache((long)evt->buf, evt->length);
>> 		}
>> -
> 
> This makes me wonder, don't you need to invalidate the event buffer
> somewhere so that the new data would be fetched from RAM ?

We flush the event buffer before leaving the function.
So the cache line will not be present in the cache, when we enter this function again.

>> -		dwc3_thread_interrupt(0, dwc);
>> 	}
>> }
>> 
> 
> One last thing, is this patch needed in Linux too ?

Linux deals properly with DMA allocations and manages them in appropriate memory regions (e.g. marked uncached).
Also, some of the affected code-paths are U-Boot specific.

This really stems from a limitation of the way the DMA areas are allocated in U-Boot (i.e. from the heap, using a memalign) and how the cache-operations have been sequenced relative to the other code in the port to U-Boot.

Regards,
Philipp.


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 17:46   ` Dr. Philipp Tomsich
@ 2017-04-04 19:01     ` Marek Vasut
  2017-04-04 19:56       ` Dr. Philipp Tomsich
  2017-04-05  8:18       ` Felipe Balbi
  0 siblings, 2 replies; 17+ messages in thread
From: Marek Vasut @ 2017-04-04 19:01 UTC (permalink / raw)
  To: u-boot

On 04/04/2017 07:46 PM, Dr. Philipp Tomsich wrote:
> 
>> On 04 Apr 2017, at 18:15, Marek Vasut <marex@denx.de> wrote:
>>
>> On 04/03/2017 07:49 PM, Philipp Tomsich wrote:
>>> Merely using dma_alloc_coherent does not ensure that there is no stale
>>> data left in the caches for the allocated DMA buffer (i.e. that the
>>> affected cacheline may still be dirty).
>>>
>>> The original code was doing the following (on AArch64, which
>>> translates a 'flush' into a 'clean + invalidate'):
>>>  # during initialisation:
>>>      1. allocate buffers via memalign
>>>      	 => buffers may still be modified (cached, dirty)
>>>  # during interrupt processing
>>>      2. clean + invalidate buffers
>>>      	 => may commit stale data from a modified cacheline
>>>      3. read from buffers
>>>
>>> This could lead to garbage info being written to buffers before
>>> reading them during event processing.
>>>
>>> To make the event processing more robust, we use the following sequence
>>> for the cache-maintenance:
>>>  # during initialisation:
>>>      1. allocate buffers via memalign
>>>      2. clean + invalidate buffers
>>>      	 (we only need the 'invalidate' part, but dwc3_flush_cache()
>>> 	  always performs a 'clean + invalidate')
>>>  # during interrupt processing
>>>      3. read the buffers
>>>      	 (we know these lines are not cached, due to the previous
>>> 	  invalidation and no other code touching them in-between)
>>>      4. clean + invalidate buffers
>>>      	 => writes back any modification we may have made during event
>>> 	    processing and ensures that the lines are not in the cache
>>> 	    the next time we enter interrupt processing
>>>
>>> Note that with the original sequence, we observe reproducible
>>> (depending on the cache state: i.e. running dhcp/usb start before will
>>> upset caches to get us around this) issues in the event processing (a
>>> fatal synchronous abort in dwc3_gadget_uboot_handle_interrupt on the
>>> first time interrupt handling is invoked) when running USB mass
>>> storage emulation on our RK3399-Q7 with data-caches on.
>>>
>>> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>>>
>>> ---
>>>
>>> drivers/usb/dwc3/core.c   | 2 ++
>>> drivers/usb/dwc3/gadget.c | 5 +++--
>>> 2 files changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>> index b2c7eb1..f58c7ba 100644
>>> --- a/drivers/usb/dwc3/core.c
>>> +++ b/drivers/usb/dwc3/core.c
>>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>>> 	if (!evt->buf)
>>> 		return ERR_PTR(-ENOMEM);
>>>
>>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>>> +
>>
>> Is the length aligned ? If not, you will get cache alignment warning.
>> Also, address should be uintptr_t to avoid 32/64 bit issues .
> 
> The length is a well-known value and aligned (it expands to PAGE_SIZE in the end).

Uh, the event buffer is 4k ? That's quite big, but OK.

> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:

Hrm, I thought the driver was ported over from Linux, so is this broken
in Linux too ?

> 	drivers/usb/dwc3/core.c:        dwc3_flush_cache((long)evt->buf, evt->length);
> 	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)buf_dma, len);
> 	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)trb, sizeof(*trb));
> 	drivers/usb/dwc3/ep0.c: dwc3_flush_cache((long)trb, sizeof(*trb));
> 	drivers/usb/dwc3/ep0.c:                 dwc3_flush_cache((long)trb, sizeof(*trb));
> 	drivers/usb/dwc3/ep0.c:         dwc3_flush_cache((long)dwc->ep0_bounce, DWC3_EP0_BOUNCE_SIZE);
> 	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)req->request.dma, req->request.length);
> 	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)dma, length);
> 	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)trb, sizeof(*trb));
> 	drivers/usb/dwc3/gadget.c:      dwc3_flush_cache((long)trb, sizeof(*trb));
> 	drivers/usb/dwc3/gadget.c:                      dwc3_flush_cache((long)evt->buf, evt->length);
> 	drivers/usb/dwc3/io.h:static inline void dwc3_flush_cache(int addr, int length)
> 
> Worst of all: the definition of dwc3_flush_cache in io.h has “int” as a type, which will eat us alive if the DWC3’s physical address is beyond 32-bit.
> 
> I’ll revise all of these and make a patch-series out of this.

Maybe you should check Linux first and see if there are some fixes
already.

Thanks

>>> 	return evt;
>>> }
>>>
>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 1156662..61af71b 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>>> 		int i;
>>> 		struct dwc3_event_buffer *evt;
>>>
>>> +		dwc3_thread_interrupt(0, dwc);
>>> +
>>> +		/* Clean + Invalidate the buffers after touching them */
>>> 		for (i = 0; i < dwc->num_event_buffers; i++) {
>>> 			evt = dwc->ev_buffs[i];
>>> 			dwc3_flush_cache((long)evt->buf, evt->length);
>>> 		}
>>> -
>>
>> This makes me wonder, don't you need to invalidate the event buffer
>> somewhere so that the new data would be fetched from RAM ?
> 
> We flush the event buffer before leaving the function.
> So the cache line will not be present in the cache, when we enter this function again.

Then shouldn't we invalidate it instead ? flush and invalidate are two
different things ...

>>> -		dwc3_thread_interrupt(0, dwc);
>>> 	}
>>> }
>>>
>>
>> One last thing, is this patch needed in Linux too ?
> 
> Linux deals properly with DMA allocations and manages them in appropriate memory regions (e.g. marked uncached).
> Also, some of the affected code-paths are U-Boot specific.
> 
> This really stems from a limitation of the way the DMA areas are allocated in U-Boot (i.e. from the heap, using a memalign) and how the cache-operations have been sequenced relative to the other code in the port to U-Boot.

OK I see, thanks for clarifying!

-- 
Best regards,
Marek Vasut


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 19:01     ` Marek Vasut
@ 2017-04-04 19:56       ` Dr. Philipp Tomsich
  2017-04-04 20:09         ` Marek Vasut
  2017-04-05  8:18       ` Felipe Balbi
  1 sibling, 1 reply; 17+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-04 19:56 UTC (permalink / raw)
  To: u-boot


> On 04 Apr 2017, at 21:01, Marek Vasut <marex@denx.de> wrote:
> 
>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
> 
> Hrm, I thought the driver was ported over from Linux, so is this broken
> in Linux too ?

Apparently, the dwc3_flush_cache calls (and the function itself) have been
introduced during the porting. There’s no explicit cache-maintenance in DWC3
for Linux. 

>> I’ll revise all of these and make a patch-series out of this.
> 
> Maybe you should check Linux first and see if there are some fixes
> already.
> 
> Thanks

Given that this seems to have been introduced with the port to U-Boot, there
are no applicable fixes there.

>>>> 	return evt;
>>>> }
>>>> 
>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>> index 1156662..61af71b 100644
>>>> --- a/drivers/usb/dwc3/gadget.c
>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>>>> 		int i;
>>>> 		struct dwc3_event_buffer *evt;
>>>> 
>>>> +		dwc3_thread_interrupt(0, dwc);
>>>> +
>>>> +		/* Clean + Invalidate the buffers after touching them */
>>>> 		for (i = 0; i < dwc->num_event_buffers; i++) {
>>>> 			evt = dwc->ev_buffs[i];
>>>> 			dwc3_flush_cache((long)evt->buf, evt->length);
>>>> 		}
>>>> -
>>> 
>>> This makes me wonder, don't you need to invalidate the event buffer
>>> somewhere so that the new data would be fetched from RAM ?
>> 
>> We flush the event buffer before leaving the function.
>> So the cache line will not be present in the cache, when we enter this function again.
> 
> Then shouldn't we invalidate it instead ? flush and invalidate are two
> different things …

The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
it is used as in my patch:
a. before the first time data is expected to be written by the peripheral (i.e.
before the peripheral is started)—to ensure that the cache line is not cached
any longer…
b. after the driver modifies any buffers (i.e. anything modified will be written
back) and before it next reads the buffers expecting possibly changed data
(i.e. invalidating).

Regards,
Philipp.


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 19:56       ` Dr. Philipp Tomsich
@ 2017-04-04 20:09         ` Marek Vasut
  2017-04-04 20:26           ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 17+ messages in thread
From: Marek Vasut @ 2017-04-04 20:09 UTC (permalink / raw)
  To: u-boot

On 04/04/2017 09:56 PM, Dr. Philipp Tomsich wrote:
> 
>> On 04 Apr 2017, at 21:01, Marek Vasut <marex@denx.de> wrote:
>>
>>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
>>
>> Hrm, I thought the driver was ported over from Linux, so is this broken
>> in Linux too ?
> 
> Apparently, the dwc3_flush_cache calls (and the function itself) have been
> introduced during the porting. There’s no explicit cache-maintenance in DWC3
> for Linux. 

OK

>>> I’ll revise all of these and make a patch-series out of this.
>>
>> Maybe you should check Linux first and see if there are some fixes
>> already.
>>
>> Thanks
> 
> Given that this seems to have been introduced with the port to U-Boot, there
> are no applicable fixes there.

OK

>>>>> 	return evt;
>>>>> }
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>>>> index 1156662..61af71b 100644
>>>>> --- a/drivers/usb/dwc3/gadget.c
>>>>> +++ b/drivers/usb/dwc3/gadget.c
>>>>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>>>>> 		int i;
>>>>> 		struct dwc3_event_buffer *evt;
>>>>>
>>>>> +		dwc3_thread_interrupt(0, dwc);
>>>>> +
>>>>> +		/* Clean + Invalidate the buffers after touching them */
>>>>> 		for (i = 0; i < dwc->num_event_buffers; i++) {
>>>>> 			evt = dwc->ev_buffs[i];
>>>>> 			dwc3_flush_cache((long)evt->buf, evt->length);
>>>>> 		}
>>>>> -
>>>>
>>>> This makes me wonder, don't you need to invalidate the event buffer
>>>> somewhere so that the new data would be fetched from RAM ?
>>>
>>> We flush the event buffer before leaving the function.
>>> So the cache line will not be present in the cache, when we enter this function again.
>>
>> Then shouldn't we invalidate it instead ? flush and invalidate are two
>> different things …
> 
> The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
> it is used as in my patch:
> a. before the first time data is expected to be written by the peripheral (i.e.
> before the peripheral is started)—to ensure that the cache line is not cached
> any longer…

So invalidate() is enough ?

> b. after the driver modifies any buffers (i.e. anything modified will be written
> back) and before it next reads the buffers expecting possibly changed data
> (i.e. invalidating).

So flush+invalidate ? Keep in mind this driver may not be used on
ARMv7/v8 only ...

-- 
Best regards,
Marek Vasut


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 20:09         ` Marek Vasut
@ 2017-04-04 20:26           ` Dr. Philipp Tomsich
  2017-04-05 10:25             ` Marek Vasut
  0 siblings, 1 reply; 17+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-04 20:26 UTC (permalink / raw)
  To: u-boot


> On 04 Apr 2017, at 22:09, Marek Vasut <marex@denx.de> wrote:
> 
>> The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
>> it is used as in my patch:
>> a. before the first time data is expected to be written by the peripheral (i.e.
>> before the peripheral is started)—to ensure that the cache line is not cached
>> any longer…
> 
> So invalidate() is enough ?

If I had to write this from scratch, I’d go with the paranoid sequence of:

	handler():
	{
		invalidate
		do my stuff
		clean
	}

However, some architectures in U-Boot (e.g. ARMv8) don’t implement the
invalidate verb. Given this, I’d rather stay as close as possible to what’s
already there.

Note that using flush (i.e. clean+invalidate) aligns with how caches are
managed throughout various other drivers in U-Boot.
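
For comparison, a sketch of both handler shapes on a toy one-line cache
model (not real hardware and not U-Boot code; struct toy_line and every
function name below are hypothetical). It assumes the handler may modify
the buffer after reading it, and that the flush-only variant relies on
the allocation-time flush from the patch.

```c
#include <stdint.h>
#include <string.h>

#define LINE 64

struct toy_line {
	uint8_t ram[LINE];	/* written by the device */
	uint8_t cache[LINE];	/* seen by the CPU while the line is valid */
	int valid, dirty;
};

static void clean(struct toy_line *l)
{
	if (l->valid && l->dirty)
		memcpy(l->ram, l->cache, LINE);
	l->dirty = 0;
}

static void invalidate(struct toy_line *l)
{
	l->valid = l->dirty = 0;
}

static void flush(struct toy_line *l)	/* clean + invalidate */
{
	clean(l);
	invalidate(l);
}

static uint8_t cpu_read(struct toy_line *l)
{
	if (!l->valid) {		/* miss: fill the line from RAM */
		memcpy(l->cache, l->ram, LINE);
		l->valid = 1;
	}
	return l->cache[0];
}

static void cpu_write(struct toy_line *l, uint8_t v)
{
	memset(l->cache, v, LINE);
	l->valid = l->dirty = 1;
}

typedef uint8_t (*handler_fn)(struct toy_line *);

/* The paranoid shape: invalidate, do my stuff, clean. */
uint8_t handler_paranoid(struct toy_line *l)
{
	uint8_t ev;

	invalidate(l);
	ev = cpu_read(l);
	cpu_write(l, 0);	/* handler marks the event as consumed */
	clean(l);
	return ev;
}

/* The shape used in the patch: do my stuff, then clean + invalidate. */
uint8_t handler_flush_only(struct toy_line *l)
{
	uint8_t ev = cpu_read(l);

	cpu_write(l, 0);
	flush(l);
	return ev;
}

/* Two events back to back, packed as (first << 4) | second. */
unsigned run_two_events(handler_fn h)
{
	struct toy_line l = {0};
	unsigned first, second;

	cpu_write(&l, 0xAA);	/* stale data from an earlier heap user */
	flush(&l);		/* allocation-time flush */
	memset(l.ram, 1, LINE);	/* device posts event 1 */
	first = h(&l);
	memset(l.ram, 2, LINE);	/* device posts event 2 */
	second = h(&l);
	return (first << 4) | second;
}
```

In this model both run_two_events(handler_paranoid) and
run_two_events(handler_flush_only) return 0x12, i.e. each handler sees
both events in order.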

> 
>> b. after the driver modifies any buffers (i.e. anything modified will be written
>> back) and before it next reads the buffers expecting possibly changed data
>> (i.e. invalidating).
> 
> So flush+invalidate ? Keep in mind this driver may not be used on
> ARMv7/v8 only …

Yes, a clean+invalidate.
The flush_dcache_range(…, …) function in U-Boot implements C+I semantics
at least on arm, arm64, avr32, powerpc, xtensa …

Regards,
Philipp.


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 16:15 ` Marek Vasut
  2017-04-04 17:46   ` Dr. Philipp Tomsich
@ 2017-04-05  8:15   ` Felipe Balbi
  2017-04-05  8:43     ` Marek Vasut
  1 sibling, 1 reply; 17+ messages in thread
From: Felipe Balbi @ 2017-04-05  8:15 UTC (permalink / raw)
  To: u-boot


Hi,

Marek Vasut <marex@denx.de> writes:
>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>> index b2c7eb1..f58c7ba 100644
>> --- a/drivers/usb/dwc3/core.c
>> +++ b/drivers/usb/dwc3/core.c
>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>>  	if (!evt->buf)
>>  		return ERR_PTR(-ENOMEM);
>>  
>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>> +
>
> Is the length aligned ? If not, you will get cache alignment warning.
> Also, address should be uintptr_t to avoid 32/64 bit issues .

if it's not aligned to 128 bits (at least, IIRC), DWC3 won't work.

>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 1156662..61af71b 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>>  		int i;
>>  		struct dwc3_event_buffer *evt;
>>  
>> +		dwc3_thread_interrupt(0, dwc);
>> +
>> +		/* Clean + Invalidate the buffers after touching them */
>>  		for (i = 0; i < dwc->num_event_buffers; i++) {
>>  			evt = dwc->ev_buffs[i];
>>  			dwc3_flush_cache((long)evt->buf, evt->length);
>>  		}
>> -
>
> This makes me wonder, don't you need to invalidate the event buffer
> somewhere so that the new data would be fetched from RAM ?

nope. In Linux we allocate from coherent memory.

-- 
balbi


* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 19:01     ` Marek Vasut
  2017-04-04 19:56       ` Dr. Philipp Tomsich
@ 2017-04-05  8:18       ` Felipe Balbi
  2017-04-05  8:33         ` Dr. Philipp Tomsich
  2017-04-05  8:43         ` Marek Vasut
  1 sibling, 2 replies; 17+ messages in thread
From: Felipe Balbi @ 2017-04-05  8:18 UTC (permalink / raw)
  To: u-boot


Hi,

Marek Vasut <marex@denx.de> writes:
>>>> Merely using dma_alloc_coherent does not ensure that there is no stale
>>>> data left in the caches for the allocated DMA buffer (i.e. that the
>>>> affected cacheline may still be dirty).
>>>>
>>>> The original code was doing the following (on AArch64, which
>>>> translates a 'flush' into a 'clean + invalidate'):
>>>>  # during initialisation:
>>>>      1. allocate buffers via memalign
>>>>      	 => buffers may still be modified (cached, dirty)
>>>>  # during interrupt processing
>>>>      2. clean + invalidate buffers
>>>>      	 => may commit stale data from a modified cacheline
>>>>      3. read from buffers
>>>>
>>>> This could lead to garbage info being written to buffers before
>>>> reading them during event processing.
>>>>
>>>> To make the event processing more robust, we use the following sequence
>>>> for the cache-maintenance:
>>>>  # during initialisation:
>>>>      1. allocate buffers via memalign
>>>>      2. clean + invalidate buffers
>>>>      	 (we only need the 'invalidate' part, but dwc3_flush_cache()
>>>> 	  always performs a 'clean + invalidate')
>>>>  # during interrupt processing
>>>>      3. read the buffers
>>>>      	 (we know these lines are not cached, due to the previous
>>>> 	  invalidation and no other code touching them in-between)
>>>>      4. clean + invalidate buffers
>>>>      	 => writes back any modification we may have made during event
>>>> 	    processing and ensures that the lines are not in the cache
>>>> 	    the next time we enter interrupt processing
>>>>
>>>> Note that with the original sequence, we observe reproducible
>>>> (depending on the cache state: i.e. running dhcp/usb start before will
>>>> upset caches to get us around this) issues in the event processing (a
>>>> fatal synchronous abort in dwc3_gadget_uboot_handle_interrupt on the
>>>> first time interrupt handling is invoked) when running USB mass
>>>> storage emulation on our RK3399-Q7 with data-caches on.
>>>>
>>>> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>>>>
>>>> ---
>>>>
>>>> drivers/usb/dwc3/core.c   | 2 ++
>>>> drivers/usb/dwc3/gadget.c | 5 +++--
>>>> 2 files changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>>> index b2c7eb1..f58c7ba 100644
>>>> --- a/drivers/usb/dwc3/core.c
>>>> +++ b/drivers/usb/dwc3/core.c
>>>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>>>> 	if (!evt->buf)
>>>> 		return ERR_PTR(-ENOMEM);
>>>>
>>>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>>>> +
>>>
>>> Is the length aligned ? If not, you will get cache alignment warning.
>>> Also, address should be uintptr_t to avoid 32/64 bit issues .
>> 
>> The length is a well-known value and aligned (it expands to PAGE_SIZE in the end).
>
> Uh, the event buffer is 4k ? That's quite big, but OK.

it really isn't when you're dealing with LPM. I've seen 4k cause
overflow events before.

>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
>
> Hrm, I thought the driver was ported over from Linux, so is this broken
> in Linux too ?

haven't seen a problem in almost 6 years dealing with this IP.

-- 
balbi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05  8:18       ` Felipe Balbi
@ 2017-04-05  8:33         ` Dr. Philipp Tomsich
  2017-04-05  9:49           ` Felipe Balbi
  2017-04-05  8:43         ` Marek Vasut
  1 sibling, 1 reply; 17+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-05  8:33 UTC (permalink / raw)
  To: u-boot

Felipe,

> On 05 Apr 2017, at 10:18, Felipe Balbi <felipe.balbi@linux.intel.com> wrote:
> 
>>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
>> 
>> Hrm, I thought the driver was ported over from Linux, so is this broken
>> in Linux too ?
> 
> haven't seen a problem in almost 6 years dealing with this IP.

The integer sizes on the flushing really aren’t a big issue, as everyone runs from the lower 32 bits as of today.
And it could easily be another 6 years before we hit the first 64-bit address for any of the buffers being flushed.
Even though the integer types on the dwc3_flush_range are consistently mismatched, that is just a sideshow and
doesn’t cause any issues for anyone.
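
For reference, the pointer-size point raised in review can be sketched as below. This is a minimal illustration, not the driver's code: flush_range_sketch() and struct flush_record are hypothetical names, and the helper only records the range it would hand to a cache-maintenance routine. The idea is simply that uintptr_t is the integer type intended to carry a converted pointer, whereas the width of long depends on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the range a cache helper would operate on. */
struct flush_record {
	uintptr_t start;
	uintptr_t end;
};

/* Hypothetical helper: converts the buffer pointer via uintptr_t, not long. */
static struct flush_record flush_range_sketch(const void *buf, size_t len)
{
	uintptr_t start = (uintptr_t)buf;
	struct flush_record rec;

	/* The range must not wrap around the address space. */
	assert(len <= UINTPTR_MAX - start);

	rec.start = start;
	rec.end = start + len;
	return rec;
}
```

On the ILP32 and LP64 ABIs a long happens to be as wide as a pointer, which is consistent with the mismatch not causing problems in practice; uintptr_t just states the intent without relying on that.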

The big one for us is really the patch submitted to reorder the flushes (i.e. clean+invalidate operations),
as we sometimes (depends both on what happened before that in U-Boot — e.g. using the network
stack will always hide this — and on what configuration we compile into U-Boot) have cachelines
matching the allocation via dma_alloc_coherent either as cached (or possibly even as modified) in our
cache.
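
The effect of the reordering can be illustrated with a toy model of a single write-back cacheline. This is only a sketch under stated assumptions: every name below (toy_line, toy_flush, and so on) is hypothetical, the "controller" writes straight to memory, and toy_flush() models a clean + invalidate. It is not how the hardware or U-Boot is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_LINE 8	/* toy cacheline size in bytes (assumption) */

/* Toy model of one write-back cacheline; all names are hypothetical. */
struct toy_line {
	unsigned char ram[TOY_LINE];	/* memory, as seen by the controller */
	unsigned char cache[TOY_LINE];	/* CPU-side copy of the line */
	bool valid;
	bool dirty;
};

/* CPU store: lands in the cache and leaves the line dirty. */
static void toy_cpu_write(struct toy_line *t, unsigned char v)
{
	memset(t->cache, v, TOY_LINE);
	t->valid = true;
	t->dirty = true;
}

/* Controller (DMA) store: bypasses the cache and hits memory directly. */
static void toy_dev_write(struct toy_line *t, unsigned char v)
{
	memset(t->ram, v, TOY_LINE);
}

/* Clean + invalidate: write back a dirty line, then drop it. */
static void toy_flush(struct toy_line *t)
{
	if (t->valid && t->dirty)
		memcpy(t->ram, t->cache, TOY_LINE);
	t->valid = false;
	t->dirty = false;
}

/* CPU load: refills from memory only if the line is not cached. */
static unsigned char toy_cpu_read(struct toy_line *t)
{
	if (!t->valid) {
		memcpy(t->cache, t->ram, TOY_LINE);
		t->valid = true;
	}
	return t->cache[0];
}

/* Original order: the flush happens after the controller wrote its event. */
static unsigned char original_sequence(void)
{
	struct toy_line t;

	memset(&t, 0, sizeof(t));
	toy_cpu_write(&t, 0xAA);	/* stale dirty data from an earlier user */
	toy_dev_write(&t, 0x55);	/* controller posts an event */
	toy_flush(&t);			/* the clean overwrites the event */
	return toy_cpu_read(&t);
}

/* Patched order: flush once at init, read first, flush after processing. */
static unsigned char patched_sequence(void)
{
	struct toy_line t;
	unsigned char ev;

	memset(&t, 0, sizeof(t));
	toy_cpu_write(&t, 0xAA);	/* stale dirty data from an earlier user */
	toy_flush(&t);			/* init: before the controller runs */
	assert(!t.valid);		/* line is known not to be cached */
	toy_dev_write(&t, 0x55);	/* controller posts an event */
	ev = toy_cpu_read(&t);		/* refills from memory */
	toy_flush(&t);			/* leave the line uncached for next time */
	return ev;
}
```

In this model original_sequence() hands back the stale 0xAA, while patched_sequence() returns the 0x55 written by the controller.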

Any opinion on changing the sequencing of cache-maintenance relative to the payload?

Regards,
Philipp.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05  8:15   ` Felipe Balbi
@ 2017-04-05  8:43     ` Marek Vasut
  0 siblings, 0 replies; 17+ messages in thread
From: Marek Vasut @ 2017-04-05  8:43 UTC (permalink / raw)
  To: u-boot

On 04/05/2017 10:15 AM, Felipe Balbi wrote:
> 
> Hi,
> 
> Marek Vasut <marex@denx.de> writes:
>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>> index b2c7eb1..f58c7ba 100644
>>> --- a/drivers/usb/dwc3/core.c
>>> +++ b/drivers/usb/dwc3/core.c
>>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>>>  	if (!evt->buf)
>>>  		return ERR_PTR(-ENOMEM);
>>>  
>>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>>> +
>>
>> Is the length aligned ? If not, you will get cache alignment warning.
>> Also, address should be uintptr_t to avoid 32/64 bit issues .
> 
> if it's not aligned to 128 bits (at least, IIRC), DWC3 won't work.

So this is already implicitly aligned, cool.
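
A range check of the kind behind the alignment warning can be sketched as follows. This is a hedged illustration only: range_is_line_aligned() is a hypothetical helper, and the 64-byte line size used in the usage note is an assumption, not a value taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if both the start address and the length
 * are multiples of the cacheline size, which must be a power of two.
 */
static bool range_is_line_aligned(uintptr_t start, size_t len, size_t line)
{
	assert(line != 0 && (line & (line - 1)) == 0);

	return (start & (line - 1)) == 0 && (len & (line - 1)) == 0;
}
```

With an assumed 64-byte line, a 4096-byte buffer starting at 0x1000 passes the check, while a buffer that is only 16-byte (128-bit) aligned, such as one starting at 0x1010, does not. The 4k event buffer length mentioned elsewhere in the thread is a multiple of both.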

>>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>>> index 1156662..61af71b 100644
>>> --- a/drivers/usb/dwc3/gadget.c
>>> +++ b/drivers/usb/dwc3/gadget.c
>>> @@ -2668,11 +2668,12 @@ void dwc3_gadget_uboot_handle_interrupt(struct dwc3 *dwc)
>>>  		int i;
>>>  		struct dwc3_event_buffer *evt;
>>>  
>>> +		dwc3_thread_interrupt(0, dwc);
>>> +
>>> +		/* Clean + Invalidate the buffers after touching them */
>>>  		for (i = 0; i < dwc->num_event_buffers; i++) {
>>>  			evt = dwc->ev_buffs[i];
>>>  			dwc3_flush_cache((long)evt->buf, evt->length);
>>>  		}
>>> -
>>
>> This makes me wonder, don't you need to invalidate the event buffer
>> somewhere so that the new data would be fetched from RAM ?
> 
> nope. In linux we allocate from coherent

Got it , thanks.

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05  8:18       ` Felipe Balbi
  2017-04-05  8:33         ` Dr. Philipp Tomsich
@ 2017-04-05  8:43         ` Marek Vasut
  1 sibling, 0 replies; 17+ messages in thread
From: Marek Vasut @ 2017-04-05  8:43 UTC (permalink / raw)
  To: u-boot

On 04/05/2017 10:18 AM, Felipe Balbi wrote:
> 
> Hi,
> 
> Marek Vasut <marex@denx.de> writes:
>>>>> Merely using dma_alloc_coherent does not ensure that there is no stale
>>>>> data left in the caches for the allocated DMA buffer (i.e. that the
>>>>> affected cacheline may still be dirty).
>>>>>
>>>>> The original code was doing the following (on AArch64, which
>>>>> translates a 'flush' into a 'clean + invalidate'):
>>>>>  # during initialisation:
>>>>>      1. allocate buffers via memalign
>>>>>      	 => buffers may still be modified (cached, dirty)
>>>>>  # during interrupt processing
>>>>>      2. clean + invalidate buffers
>>>>>      	 => may commit stale data from a modified cacheline
>>>>>      3. read from buffers
>>>>>
>>>>> This could lead to garbage info being written to buffers before
>>>>> reading them during event-processing.
>>>>>
>>>>> To make the event processing more robust, we use the following sequence
>>>>> for the cache-maintenance:
>>>>>  # during initialisation:
>>>>>      1. allocate buffers via memalign
>>>>>      2. clean + invalidate buffers
>>>>>      	 (we only need the 'invalidate' part, but dwc3_flush_cache()
>>>>> 	  always performs a 'clean + invalidate')
>>>>>  # during interrupt processing
>>>>>      3. read the buffers
>>>>>      	 (we know these lines are not cached, due to the previous
>>>>> 	  invalidation and no other code touching them in-between)
>>>>>      4. clean + invalidate buffers
>>>>>      	 => writes back any modification we may have made during event
>>>>> 	    processing and ensures that the lines are not in the cache
>>>>> 	    the next time we enter interrupt processing
>>>>>
>>>>> Note that with the original sequence, we observe reproducible
>>>>> (depending on the cache state: i.e. running dhcp/usb start before will
>>>>> upset caches to get us around this) issues in the event processing (a
>>>>> fatal synchronous abort in dwc3_gadget_uboot_handle_interrupt on the
>>>>> first time interrupt handling is invoked) when running USB mass
>>>>> storage emulation on our RK3399-Q7 with data-caches on.
>>>>>
>>>>> Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
>>>>>
>>>>> ---
>>>>>
>>>>> drivers/usb/dwc3/core.c   | 2 ++
>>>>> drivers/usb/dwc3/gadget.c | 5 +++--
>>>>> 2 files changed, 5 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>>>> index b2c7eb1..f58c7ba 100644
>>>>> --- a/drivers/usb/dwc3/core.c
>>>>> +++ b/drivers/usb/dwc3/core.c
>>>>> @@ -125,6 +125,8 @@ static struct dwc3_event_buffer *dwc3_alloc_one_event_buffer(struct dwc3 *dwc,
>>>>> 	if (!evt->buf)
>>>>> 		return ERR_PTR(-ENOMEM);
>>>>>
>>>>> +	dwc3_flush_cache((long)evt->buf, evt->length);
>>>>> +
>>>>
>>>> Is the length aligned ? If not, you will get cache alignment warning.
>>>> Also, address should be uintptr_t to avoid 32/64 bit issues .
>>>
>>> The length is a well-known value and aligned (it expands to PAGE_SIZE in the end).
>>
>> Uh, the event buffer is 4k ? That's quite big, but OK.
> 
> it really isn't when you're dealing with LPM. I've seen 4k cause
> overflow events before.

OK

>>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
>>
>> Hrm, I thought the driver was ported over from Linux, so is this broken
>> in Linux too ?
> 
> haven't seen a problem in almost 6 years dealing with this IP.

:)

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05  8:33         ` Dr. Philipp Tomsich
@ 2017-04-05  9:49           ` Felipe Balbi
  0 siblings, 0 replies; 17+ messages in thread
From: Felipe Balbi @ 2017-04-05  9:49 UTC (permalink / raw)
  To: u-boot


Hi,

"Dr. Philipp Tomsich" <philipp.tomsich@theobroma-systems.com> writes:
>>>> Good point on the “long”, especially as I just copied this from other occurrences and it’s consistently wrong throughout DWC3 in U-Boot:
>>> 
>>> Hrm, I thought the driver was ported over from Linux, so is this broken
>>> in Linux too ?
>> 
>> haven't seen a problem in almost 6 years dealing with this IP.
>
> The integer sizes on the flushing really aren’t a big issue, as everyone runs from the lower 32 bits as of today.
> And it could easily be another 6 years before we hit the first 64-bit address for any of the buffers being flushed.
> Even though the integer types on the dwc3_flush_range are consistently mismatched, that is just a sideshow and
> doesn’t cause any issues for anyone.
>
> The big one for us is really the patch submitted to reorder the flushes (i.e. clean+invalidate operations),
> as we sometimes (depends both on what happened before that in U-Boot — e.g. using the network
> stack will always hide this — and on what configuration we compile into U-Boot) have cachelines
> matching the allocation via dma_alloc_coherent either as cached (or possibly even as modified) in our
> cache.
>
> Any opinion on changing the sequencing of cache-maintenance relative to the payload?

no opinion, no. We've had one similar issue in linux WRT RNDIS. It was a
very similar situation (cache maintenance was ordered wrongly and ended
up corrupting req->buf).

-- 
balbi

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-04 20:26           ` Dr. Philipp Tomsich
@ 2017-04-05 10:25             ` Marek Vasut
  2017-04-05 10:57               ` Dr. Philipp Tomsich
  0 siblings, 1 reply; 17+ messages in thread
From: Marek Vasut @ 2017-04-05 10:25 UTC (permalink / raw)
  To: u-boot

On 04/04/2017 10:26 PM, Dr. Philipp Tomsich wrote:
> 
>> On 04 Apr 2017, at 22:09, Marek Vasut <marex@denx.de> wrote:
>>
>>> The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
>>> it is used as in my patch:
>>> a. before the first time data is expected to be written by the peripheral (i.e.
>>> before the peripheral is started)—to ensure that the cache line is not cached
>>> any longer…
>>
>> So invalidate() is enough ?
> 
> If I had to write this from scratch, I’d go with the paranoid sequence of:
> 
> 	handler():
> 	{
> 		invalidate
> 		do my stuff
> 		clean
> 	}
> 
> However, some architectures in U-Boot (e.g. ARMv8) don’t implement the
> invalidate verb. Given this, I’d rather stay as close to what’s already there.

invalidate_dcache_range() must be implemented if flush_dcache_range()
is, otherwise it's a bug.

> Note that using flush (i.e. clean+invalidate) aligns with how caches are
> managed throughout various other drivers in U-Boot.
> 
>>
>>> b. after the driver modifies any buffers (i.e. anything modified will be written
>>> back) and before it next reads the buffers expecting possibly changed data
>>> (i.e. invalidating).
>>
>> So flush+invalidate ? Keep in mind this driver may not be used on
>> ARMv7/v8 only …
> 
> Yes, a clean+invalidate.
> The flush_dcache_range(…, …) function in U-Boot implements C+I semantics
> at least on arm, arm64, avr32, powerpc, xtensa …

flush on arm926 does not invalidate the cacheline iirc .

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05 10:25             ` Marek Vasut
@ 2017-04-05 10:57               ` Dr. Philipp Tomsich
  2017-04-05 11:25                 ` Marek Vasut
  0 siblings, 1 reply; 17+ messages in thread
From: Dr. Philipp Tomsich @ 2017-04-05 10:57 UTC (permalink / raw)
  To: u-boot


> On 05 Apr 2017, at 12:25, Marek Vasut <marex@denx.de> wrote:
> 
> On 04/04/2017 10:26 PM, Dr. Philipp Tomsich wrote:
>> 
>>> On 04 Apr 2017, at 22:09, Marek Vasut <marex@denx.de> wrote:
>>> 
>>>> The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
>>>> it is used as in my patch:
>>>> a. before the first time data is expected to be written by the peripheral (i.e.
>>>> before the peripheral is started)—to ensure that the cache line is not cached
>>>> any longer…
>>> 
>>> So invalidate() is enough ?
>> 
>> If I had to write this from scratch, I’d go with the paranoid sequence of:
>> 
>> 	handler():
>> 	{
>> 		invalidate
>> 		do my stuff
>> 		clean
>> 	}
>> 
>> However, some architectures in U-Boot (e.g. ARMv8) don’t implement the
>> invalidate verb. Given this, I’d rather stay as close to what’s already there.
> 
> invalidate_dcache_range() must be implemented if flush_dcache_range()
> is, otherwise it's a bug.

The ARMv8 implementation for invalidate currently maps back to a clean+invalidate
(see arch/arm/cpu/armv8/cache_v8.c):

	/*
	 * Invalidates range in all levels of D-cache/unified cache
	 */
	void invalidate_dcache_range(unsigned long start, unsigned long stop)
	{
		__asm_flush_dcache_range(start, stop);
	}

	/*
	 * Flush range(clean & invalidate) from all levels of D-cache/unified cache
	 */
	void flush_dcache_range(unsigned long start, unsigned long stop)
	{
		__asm_flush_dcache_range(start, stop);
	}

I am a bit scared of either using (as this clearly is mislabeled) or changing (as
other code might depend on things being as they are) the invalidate-function
for ARMv8 at this point.
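
The practical difference between a pure invalidate and a clean + invalidate can be sketched with a toy single-line model. All names below are hypothetical and the model is an assumption-laden illustration, not the ARMv8 implementation: it only shows what the CPU would read back if a dirty line were still present when the maintenance operation runs after the controller has written to memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one cacheline holding a single value; names are hypothetical. */
struct line_model {
	unsigned ram;		/* value in memory (written by the controller) */
	unsigned cached;	/* value held in the cache */
	bool valid;
	bool dirty;
};

/* Clean: write a dirty line back to memory, keep it cached. */
static void model_clean(struct line_model *l)
{
	if (l->valid && l->dirty) {
		l->ram = l->cached;
		l->dirty = false;
	}
}

/* Invalidate: drop the line without writing it back. */
static void model_invalidate(struct line_model *l)
{
	l->valid = false;
	l->dirty = false;
}

/* Clean + invalidate: write back first, then drop. */
static void model_clean_invalidate(struct line_model *l)
{
	model_clean(l);
	model_invalidate(l);
}

/* CPU load: refill from memory if the line is not cached. */
static unsigned model_read(struct line_model *l)
{
	assert(l != NULL);
	if (!l->valid) {
		l->cached = l->ram;
		l->valid = true;
		l->dirty = false;
	}
	return l->cached;
}

/*
 * What the CPU reads after the controller stored dev_val to memory while
 * a dirty line holding cpu_val was still in the cache.
 */
static unsigned read_after_dma(bool pure_invalidate, unsigned cpu_val,
			       unsigned dev_val)
{
	struct line_model l = {
		.ram = dev_val, .cached = cpu_val, .valid = true, .dirty = true,
	};

	if (pure_invalidate)
		model_invalidate(&l);
	else
		model_clean_invalidate(&l);

	return model_read(&l);
}
```

In this model a pure invalidate lets the CPU see the controller's value, whereas a clean + invalidate writes the stale CPU value over it first. If no dirty line exists at that point, the two operations are indistinguishable to the reader.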

>> Note that using flush (i.e. clean+invalidate) aligns with how caches are
>> managed throughout various other drivers in U-Boot.
>> 
>>> 
>>>> b. after the driver modifies any buffers (i.e. anything modified will be written
>>>> back) and before it next reads the buffers expecting possibly changed data
>>>> (i.e. invalidating).
>>> 
>>> So flush+invalidate ? Keep in mind this driver may not be used on
>>> ARMv7/v8 only …
>> 
>> Yes, a clean+invalidate.
>> The flush_dcache_range(…, …) function in U-Boot implements C+I semantics
>> at least on arm, arm64, avr32, powerpc, xtensa …
> 
> flush on arm926 does not invalidate the cacheline iirc .

I dug up an ARMv5 architecture manual (I didn’t think I’d ever need this again) to
look at what the ARM926 does.

Here’s the code for reference:
	void flush_dcache_range(unsigned long start, unsigned long stop)
	{
		if (!check_cache_range(start, stop))
			return;

		while (start < stop) {
			asm volatile("mcr p15, 0, %0, c7, c14, 1\n" : : "r"(start));
			start += CONFIG_SYS_CACHELINE_SIZE;
		}

		asm volatile("mcr p15, 0, %0, c7, c10, 4\n" : : "r"(0));
	}

c7 selects the cache-management functions, with the following opcodes:
 - “c7, c14, 1” is “Clean and invalidate data cache line” by modified virtual address (MVA)
 - “c7, c10, 4” is "Data Synchronization Barrier (formerly Drain Write Buffer)"

This discussion shows that (at some future point… and no, I am not volunteering
for this) the naming of the cache-maintenance functions and the documentation for
them should be reworked to avoid any confusion for the casual driver developer.

I’ll just go ahead and put together a v2 that also addresses the pointer-size concerns,
as I don’t want to have the fix for our DWC3 issue held up by this.


Regards,
Philipp.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
  2017-04-05 10:57               ` Dr. Philipp Tomsich
@ 2017-04-05 11:25                 ` Marek Vasut
  0 siblings, 0 replies; 17+ messages in thread
From: Marek Vasut @ 2017-04-05 11:25 UTC (permalink / raw)
  To: u-boot

On 04/05/2017 12:57 PM, Dr. Philipp Tomsich wrote:
> 
>> On 05 Apr 2017, at 12:25, Marek Vasut <marex@denx.de> wrote:
>>
>> On 04/04/2017 10:26 PM, Dr. Philipp Tomsich wrote:
>>>
>>>> On 04 Apr 2017, at 22:09, Marek Vasut <marex@denx.de> wrote:
>>>>
>>>>> The DWC3 flush expands to a clean+invalidate. It is not wrong, as long as
>>>>> it is used as in my patch:
>>>>> a. before the first time data is expected to be written by the peripheral (i.e.
>>>>> before the peripheral is started)—to ensure that the cache line is not cached
>>>>> any longer…
>>>>
>>>> So invalidate() is enough ?
>>>
>>> If I had to write this from scratch, I’d go with the paranoid sequence of:
>>>
>>> 	handler():
>>> 	{
>>> 		invalidate
>>> 		do my stuff
>>> 		clean
>>> 	}
>>>
>>> However, some architectures in U-Boot (e.g. ARMv8) don’t implement the
>>> invalidate verb. Given this, I’d rather stay as close to what’s already there.
>>
>> invalidate_dcache_range() must be implemented if flush_dcache_range()
>> is, otherwise it's a bug.
> 
> The ARMv8 implementation for invalidate currently maps back to a clean+invalidate
> (see arch/arm/cpu/armv8/cache_v8.c):

Hm, interesting and unusual. OK

> 	/*
> 	 * Invalidates range in all levels of D-cache/unified cache
> 	 */
> 	void invalidate_dcache_range(unsigned long start, unsigned long stop)
> 	{
> 	        __asm_flush_dcache_range(start, stop);
> 	}
> 
> 	/*
> 	 * Flush range(clean & invalidate) from all levels of D-cache/unified cache
> 	 */
> 	void flush_dcache_range(unsigned long start, unsigned long stop)
> 	{
> 	        __asm_flush_dcache_range(start, stop);
> 	}
> 
> I am a bit scared of either using (as this clearly is mislabeled) or changing (as
> other code might depend on things being as they are) the invalidate-function
> for ARMv8 at this point.

The naming is OK, flush == push data from cache one level down ;
invalidate == mark cacheline invalid so data would be loaded from next
level .

>>> Note that using flush (i.e. clean+invalidate) aligns with how caches are
>>> managed throughout various other drivers in U-Boot.
>>>
>>>>
>>>>> b. after the driver modifies any buffers (i.e. anything modified will be written
>>>>> back) and before it next reads the buffers expecting possibly changed data
>>>>> (i.e. invalidating).
>>>>
>>>> So flush+invalidate ? Keep in mind this driver may not be used on
>>>> ARMv7/v8 only …
>>>
>>> Yes, a clean+invalidate.
>>> The flush_dcache_range(…, …) function in U-Boot implements C+I semantics
>>> at least on arm, arm64, avr32, powerpc, xtensa …
>>
>> flush on arm926 does not invalidate the cacheline iirc .
> 
> I dug up an ARMv5 architecture manual (I didn’t think I’d ever need this again) to
> look at what the ARM926 does.
> 
> Here’s the code for reference:
> 	void flush_dcache_range(unsigned long start, unsigned long stop)
> 	{
>         	if (!check_cache_range(start, stop))
>                 	return;
> 
> 	        while (start < stop) {
>         	        asm volatile("mcr p15, 0, %0, c7, c14, 1\n" : : "r"(start));
>                 	start += CONFIG_SYS_CACHELINE_SIZE;
> 	        }
> 
>         	asm volatile("mcr p15, 0, %0, c7, c10, 4\n" : : "r"(0));
> 	}
> 
> c7 are the cache-management functions, with the following opcodes:
> —	“c7, c14, 1” is “Clean and invalidate data cache line” on the modified virtual-address (MVA)
> —	“c7, c10, 4” is "Data Synchronization Barrier (formerly Drain Write Buffer)"
> 
> This discussion shows that (at some future point… and no, I am not volunteering
> for this) the naming of the cache-maintenance functions and the documentation for
> them should be reworked to avoid any confusion for the casual driver developer.
> 
> I’ll just go ahead and put together a v2 that also addresses the pointer-size concerns,
> as I don’t want to have the fix for our DWC3 issue held up by this.

OK

-- 
Best regards,
Marek Vasut

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust
@ 2017-04-08  6:09 mohammadjannati04
  0 siblings, 0 replies; 17+ messages in thread
From: mohammadjannati04 @ 2017-04-08  6:09 UTC (permalink / raw)
  To: u-boot





Sent from a Samsung Galaxy smartphone.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-04-08  6:09 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-03 17:49 [U-Boot] [PATCH] usb: dwc3: gadget: make cache-maintenance on event buffers more robust Philipp Tomsich
2017-04-04 16:15 ` Marek Vasut
2017-04-04 17:46   ` Dr. Philipp Tomsich
2017-04-04 19:01     ` Marek Vasut
2017-04-04 19:56       ` Dr. Philipp Tomsich
2017-04-04 20:09         ` Marek Vasut
2017-04-04 20:26           ` Dr. Philipp Tomsich
2017-04-05 10:25             ` Marek Vasut
2017-04-05 10:57               ` Dr. Philipp Tomsich
2017-04-05 11:25                 ` Marek Vasut
2017-04-05  8:18       ` Felipe Balbi
2017-04-05  8:33         ` Dr. Philipp Tomsich
2017-04-05  9:49           ` Felipe Balbi
2017-04-05  8:43         ` Marek Vasut
2017-04-05  8:15   ` Felipe Balbi
2017-04-05  8:43     ` Marek Vasut
2017-04-08  6:09 mohammadjannati04
