All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache
@ 2023-09-19 11:28 Volodymyr Babchuk
  2023-09-19 11:28 ` [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events Volodymyr Babchuk
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-19 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Volodymyr Babchuk, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

Hello,

There were a couple of issues with GICv3 ITS implementation in
Xen. From user perspective it looks like no interrupts are
delivered. I observed those issues when experimented with SR-IOV on
Renesas S4 board. In my case it wasn't a 100% reproducible issue, so
it took some time and couple of tries to fix it. I wasn't sure if my
fix addressed some hardware quirks of S4 board or it was a generic
solution, so I postponed publishing of it.

Later, Stewart Hildebrand had very simmilar issues with his setup. I
shared those 3 patches with him and they fixed his issue as well. So,
I believe we need those changes in Xen mainline.

Second patch ("ARM: GICv3 ITS: do not invalidate memory while sending
a command") is not strictly required, as it just provides a small
optimization, but I believe it would be nice to have it in the code
base.

Volodymyr Babchuk (3):
  ARM: GICv3 ITS: issue INVALL command after mapping host events
  ARM: GICv3 ITS: do not invalidate memory while sending a command
  ARM: GICv3 ITS: flush all buffers, not just command queue

 xen/arch/arm/gic-v3-its.c             | 27 ++++++++++++++++++++++-----
 xen/arch/arm/include/asm/gic_v3_its.h |  2 +-
 2 files changed, 23 insertions(+), 6 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events
  2023-09-19 11:28 [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Volodymyr Babchuk
@ 2023-09-19 11:28 ` Volodymyr Babchuk
  2023-09-19 12:22   ` Julien Grall
  2023-09-19 11:28 ` [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command Volodymyr Babchuk
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-19 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Volodymyr Babchuk, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

Implement TODO by calling the INVALL command. It is working on real
HW, so there is no sense in not doing this.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
 xen/arch/arm/gic-v3-its.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 3aa4edda10..a9c971a55f 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -236,6 +236,19 @@ static int its_send_cmd_inv(struct host_its *its,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_invall(struct host_its *its,
+                               uint32_t collection_id)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INVALL;
+    cmd[1] = 0x00;
+    cmd[2] = collection_id;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(unsigned int cpu)
 {
@@ -593,7 +606,9 @@ static int gicv3_its_map_host_events(struct host_its *its,
             return ret;
     }
 
-    /* TODO: Consider using INVALL here. Didn't work on the model, though. */
+    ret = its_send_cmd_invall(its, 0);
+    if ( ret )
+        return ret;
 
     ret = its_send_cmd_sync(its, 0);
     if ( ret )
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue
  2023-09-19 11:28 [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Volodymyr Babchuk
  2023-09-19 11:28 ` [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events Volodymyr Babchuk
  2023-09-19 11:28 ` [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command Volodymyr Babchuk
@ 2023-09-19 11:28 ` Volodymyr Babchuk
  2023-09-19 13:10   ` Julien Grall
  2023-09-19 19:07 ` [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Stewart Hildebrand
  3 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-19 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Volodymyr Babchuk, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

ITS manages Device Tables and Interrupt Translation Tables on its own,
so generally we are not interested which shareability and cacheability
attributes it uses. But there is one exception: ITS requires that DT
and ITT must be initialized with zeroes. If ITS belongs to the Inner
Cacheability domain there is no problem at all.

But in all other cases we need to do clean CPU caches manually, or
otherwise CPU can overwrite DT and ITT entries. From user perspective
this looks like interrupts are not delivered from a device.

Also, we will rename HOST_ITS_FLUSH_CMD_QUEUE flag to
HOST_ITS_FLUSH_BUFFERS because now this flag controls not only command
queue.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
 xen/arch/arm/gic-v3-its.c             | 7 +++++--
 xen/arch/arm/include/asm/gic_v3_its.h | 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 72cf318810..63e28a7706 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -107,7 +107,7 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
     }
 
     memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
-    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
+    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
         clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
     else
         dsb(ishst);
@@ -335,7 +335,7 @@ static void *its_map_cbaser(struct host_its *its)
      */
     if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
     {
-        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
+        its->flags |= HOST_ITS_FLUSH_BUFFERS;
         printk(XENLOG_WARNING "using non-cacheable ITS command queue\n");
     }
 
@@ -699,6 +699,9 @@ int gicv3_its_map_guest_device(struct domain *d,
     if ( !itt_addr )
         goto out_unlock;
 
+    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
+        clean_dcache_va_range(itt_addr, nr_events * hw_its->itte_size);
+
     dev = xzalloc(struct its_device);
     if ( !dev )
         goto out_unlock;
diff --git a/xen/arch/arm/include/asm/gic_v3_its.h b/xen/arch/arm/include/asm/gic_v3_its.h
index c24d4752d0..460b008db5 100644
--- a/xen/arch/arm/include/asm/gic_v3_its.h
+++ b/xen/arch/arm/include/asm/gic_v3_its.h
@@ -107,7 +107,7 @@
 #include <xen/device_tree.h>
 #include <xen/rbtree.h>
 
-#define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
+#define HOST_ITS_FLUSH_BUFFERS          (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
 
 /* We allocate LPIs on the hosts in chunks of 32 to reduce handling overhead. */
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-19 11:28 [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Volodymyr Babchuk
  2023-09-19 11:28 ` [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events Volodymyr Babchuk
@ 2023-09-19 11:28 ` Volodymyr Babchuk
  2023-09-19 13:02   ` Julien Grall
  2023-09-19 11:28 ` [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue Volodymyr Babchuk
  2023-09-19 19:07 ` [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Stewart Hildebrand
  3 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-19 11:28 UTC (permalink / raw)
  To: xen-devel
  Cc: Stewart Hildebrand, Volodymyr Babchuk, Stefano Stabellini,
	Julien Grall, Bertrand Marquis, Volodymyr Babchuk

There is no need to invalidate cache entry because we just wrote into a
memory region. Writing itself guarantees that cache entry is valid.

But we still need to flush cache line to be sure that ITS sees a
command written into a queue.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
---
 xen/arch/arm/gic-v3-its.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index a9c971a55f..72cf318810 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -108,8 +108,7 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
 
     memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
     if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
-        clean_and_invalidate_dcache_va_range(hw_its->cmd_buf + writep,
-                                             ITS_CMD_SIZE);
+        clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
     else
         dsb(ishst);
 
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events
  2023-09-19 11:28 ` [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events Volodymyr Babchuk
@ 2023-09-19 12:22   ` Julien Grall
  0 siblings, 0 replies; 15+ messages in thread
From: Julien Grall @ 2023-09-19 12:22 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel
  Cc: Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis

Hi Volodymyr,

On 19/09/2023 12:28, Volodymyr Babchuk wrote:
> Implement TODO by calling the INVALL command.

I think the TODO was meant to be an optimization because AFAICT we are 
already sending an INV command per event.

>  It is working on real
> HW, so there is no sense in not doing this.

A patch should be justified based from the spec or an errata. Not that 
fact it works on a real HW. At the moment, I don't quite understand why 
you need one because as said above, we are technically already sending 
an INV per event so we should be covered.

Removing the INV and using INVALL would make sense as an optimization. 
Yet given the current code doesn't seem to work, I would like to 
understand what's the problem of using INV.

> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> ---
>   xen/arch/arm/gic-v3-its.c | 17 ++++++++++++++++-
>   1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 3aa4edda10..a9c971a55f 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -236,6 +236,19 @@ static int its_send_cmd_inv(struct host_its *its,
>       return its_send_command(its, cmd);
>   }
>   
> +static int its_send_cmd_invall(struct host_its *its,
> +                               uint32_t collection_id)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INVALL;
> +    cmd[1] = 0x00;
> +    cmd[2] = collection_id;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>   /* Set up the (1:1) collection mapping for the given host CPU. */
>   int gicv3_its_setup_collection(unsigned int cpu)
>   {
> @@ -593,7 +606,9 @@ static int gicv3_its_map_host_events(struct host_its *its,
>               return ret;
>       }
>   
> -    /* TODO: Consider using INVALL here. Didn't work on the model, though. */
> +    ret = its_send_cmd_invall(its, 0);
> +    if ( ret )
> +        return ret;
>   
>       ret = its_send_cmd_sync(its, 0);
>       if ( ret )

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-19 11:28 ` [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command Volodymyr Babchuk
@ 2023-09-19 13:02   ` Julien Grall
  2023-09-19 14:36     ` Volodymyr Babchuk
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2023-09-19 13:02 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel
  Cc: Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis

Hi Volodymyr,

On 19/09/2023 12:28, Volodymyr Babchuk wrote:
> There is no need to invalidate cache entry because we just wrote into a
> memory region. Writing itself guarantees that cache entry is valid.

The goal of invalidate is to remove the line from the cache. So I don't 
quite understand the reasoning here.

So I am a little hesitant to remove the invalidate without some 
justification based on the specification.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue
  2023-09-19 11:28 ` [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue Volodymyr Babchuk
@ 2023-09-19 13:10   ` Julien Grall
  2023-09-19 13:14     ` Julien Grall
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2023-09-19 13:10 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel
  Cc: Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis

Hi Volodymyr,

On 19/09/2023 12:28, Volodymyr Babchuk wrote:
> ITS manages Device Tables and Interrupt Translation Tables on its own,
> so generally we are not interested which shareability and cacheability
> attributes it uses. But there is one exception: ITS requires that DT
> and ITT must be initialized with zeroes. If ITS belongs to the Inner
> Cacheability domain there is no problem at all.
> 
> But in all other cases we need to do clean CPU caches manually, or
> otherwise CPU can overwrite DT and ITT entries. From user perspective
> this looks like interrupts are not delivered from a device.
> 
> Also, we will rename HOST_ITS_FLUSH_CMD_QUEUE flag to
> HOST_ITS_FLUSH_BUFFERS because now this flag controls not only command
> queue.

Reading the specification, CBASER will indicate the cacheability for the 
command queue. But I couldn't find any reference saying this will apply 
to the ITT as well.

If such reference doesn't exist then...

> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> ---
>   xen/arch/arm/gic-v3-its.c             | 7 +++++--
>   xen/arch/arm/include/asm/gic_v3_its.h | 2 +-
>   2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 72cf318810..63e28a7706 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -107,7 +107,7 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
>       }
>   
>       memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
> -    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
>           clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>       else
>           dsb(ishst);
> @@ -335,7 +335,7 @@ static void *its_map_cbaser(struct host_its *its)
>        */
>       if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
>       {
> -        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
> +        its->flags |= HOST_ITS_FLUSH_BUFFERS;
>           printk(XENLOG_WARNING "using non-cacheable ITS command queue\n");
>       }
>   
> @@ -699,6 +699,9 @@ int gicv3_its_map_guest_device(struct domain *d,
>       if ( !itt_addr )
>           goto out_unlock;
>   
> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
> +        clean_dcache_va_range(itt_addr, nr_events * hw_its->itte_size);

... I think we need to have this flush unconditional like Linux does.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue
  2023-09-19 13:10   ` Julien Grall
@ 2023-09-19 13:14     ` Julien Grall
  2023-09-19 19:17       ` Stewart Hildebrand
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2023-09-19 13:14 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel
  Cc: Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis

Hi,

On 19/09/2023 14:10, Julien Grall wrote:
> On 19/09/2023 12:28, Volodymyr Babchuk wrote:
>> ITS manages Device Tables and Interrupt Translation Tables on its own,
>> so generally we are not interested which shareability and cacheability
>> attributes it uses. But there is one exception: ITS requires that DT
>> and ITT must be initialized with zeroes. If ITS belongs to the Inner
>> Cacheability domain there is no problem at all.
>>
>> But in all other cases we need to do clean CPU caches manually, or
>> otherwise CPU can overwrite DT and ITT entries. From user perspective
>> this looks like interrupts are not delivered from a device.
>>
>> Also, we will rename HOST_ITS_FLUSH_CMD_QUEUE flag to
>> HOST_ITS_FLUSH_BUFFERS because now this flag controls not only command
>> queue.
> 
> Reading the specification, CBASER will indicate the cacheability for the 
> command queue. But I couldn't find any reference saying this will apply 
> to the ITT as well.
> 
> If such reference doesn't exist then...
> 
>>
>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>> ---
>>   xen/arch/arm/gic-v3-its.c             | 7 +++++--
>>   xen/arch/arm/include/asm/gic_v3_its.h | 2 +-
>>   2 files changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> index 72cf318810..63e28a7706 100644
>> --- a/xen/arch/arm/gic-v3-its.c
>> +++ b/xen/arch/arm/gic-v3-its.c
>> @@ -107,7 +107,7 @@ static int its_send_command(struct host_its 
>> *hw_its, const void *its_cmd)
>>       }
>>       memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>> -    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
>>           clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>>       else
>>           dsb(ishst);
>> @@ -335,7 +335,7 @@ static void *its_map_cbaser(struct host_its *its)
>>        */
>>       if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
>>       {
>> -        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
>> +        its->flags |= HOST_ITS_FLUSH_BUFFERS;
>>           printk(XENLOG_WARNING "using non-cacheable ITS command 
>> queue\n");
>>       }
>> @@ -699,6 +699,9 @@ int gicv3_its_map_guest_device(struct domain *d,
>>       if ( !itt_addr )
>>           goto out_unlock;
>> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
>> +        clean_dcache_va_range(itt_addr, nr_events * hw_its->itte_size);
> 
> ... I think we need to have this flush unconditional like Linux does.

I forgot to mention that this likely want to be a 
clean_dcache_and_invalidate_va_range() (see my comment in patch #2).

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-19 13:02   ` Julien Grall
@ 2023-09-19 14:36     ` Volodymyr Babchuk
  2023-09-19 15:19       ` Julien Grall
  0 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-19 14:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis


Hi Julien,

Julien Grall <julien@xen.org> writes:

> Hi Volodymyr,
>
> On 19/09/2023 12:28, Volodymyr Babchuk wrote:
>> There is no need to invalidate cache entry because we just wrote into a
>> memory region. Writing itself guarantees that cache entry is valid.
>
> The goal of invalidate is to remove the line from the cache. So I
> don't quite understand the reasoning here.
>

Well, I may be wrong, but what is the goal in removing line from the
cache? As I see this, we want to be sure that ITS sees data written in
the memory, so we should flush a cache line. But why do we need to
remove it from CPU's cache?

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-19 14:36     ` Volodymyr Babchuk
@ 2023-09-19 15:19       ` Julien Grall
  2023-09-22  0:22         ` Volodymyr Babchuk
  0 siblings, 1 reply; 15+ messages in thread
From: Julien Grall @ 2023-09-19 15:19 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel, Stewart Hildebrand, Stefano Stabellini, Bertrand Marquis

On 19/09/2023 15:36, Volodymyr Babchuk wrote:
> Julien Grall <julien@xen.org> writes:
>> On 19/09/2023 12:28, Volodymyr Babchuk wrote:
>>> There is no need to invalidate cache entry because we just wrote into a
>>> memory region. Writing itself guarantees that cache entry is valid.
>>
>> The goal of invalidate is to remove the line from the cache. So I
>> don't quite understand the reasoning here.
>>
> 
> Well, I may be wrong, but what is the goal in removing line from the
> cache? As I see this, we want to be sure that ITS sees data written in
> the memory, so we should flush a cache line. But why do we need to
> remove it from CPU's cache?

I don't exactly know. From a brief look I agree with you. However, our 
driver is based on Linux where the clean & invalidate is also used. So I 
am a little be cautious to remove it.

The way forward would be to ask the Marc Zyngier (GICv3 maintainer) why 
it was added in Linux. Then we can decide whether this can be removed in 
Xen.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache
  2023-09-19 11:28 [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Volodymyr Babchuk
                   ` (2 preceding siblings ...)
  2023-09-19 11:28 ` [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue Volodymyr Babchuk
@ 2023-09-19 19:07 ` Stewart Hildebrand
  3 siblings, 0 replies; 15+ messages in thread
From: Stewart Hildebrand @ 2023-09-19 19:07 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis

On 9/19/23 07:28, Volodymyr Babchuk wrote:
> Hello,
> 
> There were a couple of issues with GICv3 ITS implementation in
> Xen. From user perspective it looks like no interrupts are
> delivered. I observed those issues when experimented with SR-IOV on
> Renesas S4 board. In my case it wasn't a 100% reproducible issue, so
> it took some time and couple of tries to fix it. I wasn't sure if my
> fix addressed some hardware quirks of S4 board or it was a generic
> solution, so I postponed publishing of it.
> 
> Later, Stewart Hildebrand had very simmilar issues with his setup. I
> shared those 3 patches with him and they fixed his issue as well. So,
> I believe we need those changes in Xen mainline.
> 
> Second patch ("ARM: GICv3 ITS: do not invalidate memory while sending
> a command") is not strictly required, as it just provides a small
> optimization, but I believe it would be nice to have it in the code
> base.

I did a bit more experimentation, and the first patch ("ARM: GICv3 ITS: issue INVALL command after mapping host events") is not strictly required either for my particular test case. But the third one ("ARM: GICv3 ITS: flush all buffers, not just command queue") indeed appears to fix a real bug.

> 
> Volodymyr Babchuk (3):
>   ARM: GICv3 ITS: issue INVALL command after mapping host events
>   ARM: GICv3 ITS: do not invalidate memory while sending a command
>   ARM: GICv3 ITS: flush all buffers, not just command queue
> 

For the curious, here are a few more details about my test case. While testing the ("SMMU handling for PCIe Passthrough on ARM") [1] series on an AMD Versal VCK190 board, I discovered an issue with MSIs in dom0. The driver for the PCIe device has multiple IRQs, but only one of them was being raised in dom0: nvme0q0 was working, but not nvme0q1/2:

xilinx-vck190-20231:~$ cat /proc/interrupts
           CPU0       CPU1
  0:          0         92   xen-dyn     Edge    -event     xenbus
 11:      18084      11575     GICv3  27 Level     arch_timer
 12:         61         90     GICv3  16 Level     events
 13:          0          0     GICv3  62 Level     zynqmp_ipi
 15:          0          0  RC-Event   0 Level     LINK_DOWN
 16:          0          0  RC-Event   3 Level     HOT_RESET
 17:          0          0  RC-Event   4 Level     CFG_PCIE_TIMEOUT
 18:          0          0  RC-Event   8 Level     CFG_TIMEOUT
 19:          0          0  RC-Event   9 Level     CORRECTABLE
 20:          0          0  RC-Event  10 Level     NONFATAL
 21:          0          0  RC-Event  11 Level     FATAL
 22:          0          0  RC-Event  12 Level     CFG_ERR_POISON
 23:          0          0  RC-Event  15 Level     PME_TO_ACK_RCVD
 24:          0          0  RC-Event  17 Level     PM_PME_RCVD
 25:          0          0  RC-Event  20 Level     SLV_UNSUPP
 26:          0          0  RC-Event  21 Level     SLV_UNEXP
 27:          0          0  RC-Event  22 Level     SLV_COMPL
 28:          0          0  RC-Event  23 Level     SLV_ERRP
 29:          0          0  RC-Event  24 Level     SLV_CMPABT
 30:          0          0  RC-Event  25 Level     SLV_ILLBUR
 31:          0          0  RC-Event  26 Level     MST_DECERR
 32:          0          0  RC-Event  27 Level     MST_SLVERR
 33:          0          0  RC-Event  28 Level     SLV_PCIE_TIMEOUT
 35:         41          0   xen-dyn     Edge    -virq      hvc_console
 37:         17          0   ITS-MSI 524288 Edge      nvme0q0
 38:          0          0     GICv3 176 Level     sysmon-irq
 40:          0          0   ITS-MSI 524289 Edge      nvme0q1
 41:          0          0   ITS-MSI 524290 Edge      nvme0q2
 42:          0          0  xen-dyn-lateeoi     Edge    -event     evtchn:xenstored
 43:         19          0  xen-dyn-lateeoi     Edge    -event     evtchn:xenstored
IPI0:        30        101       Rescheduling interrupts
IPI1:      1888       1458       Function call interrupts
IPI2:         0          0       CPU stop interrupts
IPI3:         0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0       Timer broadcast interrupts
IPI5:         0          0       IRQ work interrupts
IPI6:         0          0       CPU wake-up interrupts
Err:          0

After applying the patch ("ARM: GICv3 ITS: flush all buffers, not just command queue") all the ITS-MSI irqs work:

xilinx-vck190-20231:~$ cat /proc/interrupts
           CPU0       CPU1
  0:          0         94   xen-dyn     Edge    -event     xenbus
 11:       4928       3938     GICv3  27 Level     arch_timer
 12:         56         95     GICv3  16 Level     events
 13:          0          0     GICv3  62 Level     zynqmp_ipi
 15:          0          0  RC-Event   0 Level     LINK_DOWN
 16:          0          0  RC-Event   3 Level     HOT_RESET
 17:          0          0  RC-Event   4 Level     CFG_PCIE_TIMEOUT
 18:          0          0  RC-Event   8 Level     CFG_TIMEOUT
 19:          0          0  RC-Event   9 Level     CORRECTABLE
 20:          0          0  RC-Event  10 Level     NONFATAL
 21:          0          0  RC-Event  11 Level     FATAL
 22:          0          0  RC-Event  12 Level     CFG_ERR_POISON
 23:          0          0  RC-Event  15 Level     PME_TO_ACK_RCVD
 24:          0          0  RC-Event  17 Level     PM_PME_RCVD
 25:          0          0  RC-Event  20 Level     SLV_UNSUPP
 26:          0          0  RC-Event  21 Level     SLV_UNEXP
 27:          0          0  RC-Event  22 Level     SLV_COMPL
 28:          0          0  RC-Event  23 Level     SLV_ERRP
 29:          0          0  RC-Event  24 Level     SLV_CMPABT
 30:          0          0  RC-Event  25 Level     SLV_ILLBUR
 31:          0          0  RC-Event  26 Level     MST_DECERR
 32:          0          0  RC-Event  27 Level     MST_SLVERR
 33:          0          0  RC-Event  28 Level     SLV_PCIE_TIMEOUT
 35:         42          0   xen-dyn     Edge    -virq      hvc_console
 37:         10          0   ITS-MSI 524288 Edge      nvme0q0
 38:         48          0   ITS-MSI 524289 Edge      nvme0q1
 39:          0         66   ITS-MSI 524290 Edge      nvme0q2
 40:          0          0     GICv3 176 Level     sysmon-irq
 42:          0          0  xen-dyn-lateeoi     Edge    -event     evtchn:xenstored
 43:         13          0  xen-dyn-lateeoi     Edge    -event     evtchn:xenstored
IPI0:        78         77       Rescheduling interrupts
IPI1:      2513       2512       Function call interrupts
IPI2:         0          0       CPU stop interrupts
IPI3:         0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0       Timer broadcast interrupts
IPI5:         0          0       IRQ work interrupts
IPI6:         0          0       CPU wake-up interrupts
Err:          0

[1] https://lists.xenproject.org/archives/html/xen-devel/2023-06/msg00353.html


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue
  2023-09-19 13:14     ` Julien Grall
@ 2023-09-19 19:17       ` Stewart Hildebrand
  0 siblings, 0 replies; 15+ messages in thread
From: Stewart Hildebrand @ 2023-09-19 19:17 UTC (permalink / raw)
  To: Julien Grall, Volodymyr Babchuk, xen-devel
  Cc: Stefano Stabellini, Bertrand Marquis

On 9/19/23 09:14, Julien Grall wrote:
> Hi,
> 
> On 19/09/2023 14:10, Julien Grall wrote:
>> On 19/09/2023 12:28, Volodymyr Babchuk wrote:
>>> ITS manages Device Tables and Interrupt Translation Tables on its own,
>>> so generally we are not interested which shareability and cacheability
>>> attributes it uses. But there is one exception: ITS requires that DT
>>> and ITT must be initialized with zeroes. If ITS belongs to the Inner
>>> Cacheability domain there is no problem at all.
>>>
>>> But in all other cases we need to do clean CPU caches manually, or
>>> otherwise CPU can overwrite DT and ITT entries. From user perspective
>>> this looks like interrupts are not delivered from a device.
>>>
>>> Also, we will rename HOST_ITS_FLUSH_CMD_QUEUE flag to
>>> HOST_ITS_FLUSH_BUFFERS because now this flag controls not only command
>>> queue.
>>
>> Reading the specification, CBASER will indicate the cacheability for the
>> command queue. But I couldn't find any reference saying this will apply
>> to the ITT as well.
>>
>> If such reference doesn't exist then...
>>
>>>
>>> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
>>> ---
>>>   xen/arch/arm/gic-v3-its.c             | 7 +++++--
>>>   xen/arch/arm/include/asm/gic_v3_its.h | 2 +-
>>>   2 files changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>> index 72cf318810..63e28a7706 100644
>>> --- a/xen/arch/arm/gic-v3-its.c
>>> +++ b/xen/arch/arm/gic-v3-its.c
>>> @@ -107,7 +107,7 @@ static int its_send_command(struct host_its
>>> *hw_its, const void *its_cmd)
>>>       }
>>>       memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>>> -    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>>> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
>>>           clean_dcache_va_range(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>>>       else
>>>           dsb(ishst);
>>> @@ -335,7 +335,7 @@ static void *its_map_cbaser(struct host_its *its)
>>>        */
>>>       if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
>>>       {
>>> -        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
>>> +        its->flags |= HOST_ITS_FLUSH_BUFFERS;
>>>           printk(XENLOG_WARNING "using non-cacheable ITS command
>>> queue\n");
>>>       }
>>> @@ -699,6 +699,9 @@ int gicv3_its_map_guest_device(struct domain *d,
>>>       if ( !itt_addr )
>>>           goto out_unlock;
>>> +    if ( hw_its->flags & HOST_ITS_FLUSH_BUFFERS )
>>> +        clean_dcache_va_range(itt_addr, nr_events * hw_its->itte_size);
>>
>> ... I think we need to have this flush unconditional like Linux does.
> 
> I forgot to mention that this likely want to be a
> clean_dcache_and_invalidate_va_range() (see my comment in patch #2).

I turned it into an unconditional clean_and_invalidate_dcache_va_range() and did a quick test, and it still fixes my test case on VCK190. So feel free to add:

Tested-by: Stewart Hildebrand <stewart.hildebrand@amd.com>


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-19 15:19       ` Julien Grall
@ 2023-09-22  0:22         ` Volodymyr Babchuk
  2023-09-22 10:31           ` Marc Zyngier
  0 siblings, 1 reply; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-22  0:22 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, Stewart Hildebrand, Stefano Stabellini,
	Bertrand Marquis, Julien Grall


Hi Mark,

I am writing to you, because you are GICv3 maintainer in Linux. We are
updating ITS driver in Xen and we have a question about cache
maintenance WRT memory shared with ITS. As I can see, the Linux ITS
driver uses gic_flush_dcache_to_poc() all over the code. This boils down
to "dc civac" instruction which does both clean and invalidate. But do
we really need to invalidate a cache when we are sending an ITS command?
In my understanding it is sufficient to clean a cache only and Linux
uses clean&invalidate just out of convenience. Is this correct?

Below you can find our discussion about this.

Julien Grall <julien@xen.org> writes:

> On 19/09/2023 15:36, Volodymyr Babchuk wrote:
>> Julien Grall <julien@xen.org> writes:
>>> On 19/09/2023 12:28, Volodymyr Babchuk wrote:
>>>> There is no need to invalidate cache entry because we just wrote into a
>>>> memory region. Writing itself guarantees that cache entry is valid.
>>>
>>> The goal of invalidate is to remove the line from the cache. So I
>>> don't quite understand the reasoning here.
>>>
>> Well, I may be wrong, but what is the goal in removing line from the
>> cache? As I see this, we want to be sure that ITS sees data written in
>> the memory, so we should flush a cache line. But why do we need to
>> remove it from CPU's cache?
>
> I don't exactly know. From a brief look I agree with you. However, our
> driver is based on Linux where the clean & invalidate is also used. So
> I am a little be cautious to remove it.
>
> The way forward would be to ask the Marc Zyngier (GICv3 maintainer)
> why it was added in Linux. Then we can decide whether this can be
> removed in Xen.
>
> Cheers,


-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-22  0:22         ` Volodymyr Babchuk
@ 2023-09-22 10:31           ` Marc Zyngier
  2023-09-22 22:02             ` Volodymyr Babchuk
  0 siblings, 1 reply; 15+ messages in thread
From: Marc Zyngier @ 2023-09-22 10:31 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel, Stewart Hildebrand, Stefano Stabellini,
	Bertrand Marquis, Julien Grall

Volodymyr,

On Fri, 22 Sep 2023 01:22:11 +0100,
Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
> 
> 
> Hi Mark,

s/k/c/

> 
> I am writing to you, because you are GICv3 maintainer in Linux. We are
> updating ITS driver in Xen and we have a question about cache
> maintenance WRT memory shared with ITS. As I can see, the Linux ITS
> driver uses gic_flush_dcache_to_poc() all over the code. This boils down
> to "dc civac" instruction which does both clean and invalidate. But do
> we really need to invalidate a cache when we are sending an ITS command?
> In my understanding it is sufficient to clean a cache only and Linux
> uses clean&invalidate just out of convenience. Is this correct?

It really depends how you look at it. We use DC CIVA as the standard
way to give a buffer to a device, as that's what the DMA API
does. Switching to a simple clean is possible, but I don't really see
what it brings you.

ITS commands are usually written as a single command followed by a
SYNC/VSYNC. That's a total a 8 64bit words, which makes a cache line
on 99.999% of the implementations.

What do you gain by keeping the cache line around? Not much. By the
time you go around the command queue and need the same data again, it
will have been evicted from your L1 already.

So while I don't see a problem with what you are suggesting, I also
think the change is pretty much irrelevant.

HTH,

	M.

-- 
Without deviation from the norm, progress is not possible.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command
  2023-09-22 10:31           ` Marc Zyngier
@ 2023-09-22 22:02             ` Volodymyr Babchuk
  0 siblings, 0 replies; 15+ messages in thread
From: Volodymyr Babchuk @ 2023-09-22 22:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: xen-devel, Stewart Hildebrand, Stefano Stabellini,
	Bertrand Marquis, Julien Grall


Hello Marc,

Marc Zyngier <maz@kernel.org> writes:

> Volodymyr,
>
> On Fri, 22 Sep 2023 01:22:11 +0100,
> Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> wrote:
>> 
>> 
>> Hi Mark,
>
> s/k/c/

Oh, I'm sorry.

>> 
>> I am writing to you, because you are GICv3 maintainer in Linux. We are
>> updating ITS driver in Xen and we have a question about cache
>> maintenance WRT memory shared with ITS. As I can see, the Linux ITS
>> driver uses gic_flush_dcache_to_poc() all over the code. This boils down
>> to "dc civac" instruction which does both clean and invalidate. But do
>> we really need to invalidate a cache when we are sending an ITS command?
>> In my understanding it is sufficient to clean a cache only and Linux
>> uses clean&invalidate just out of convenience. Is this correct?
>
> It really depends how you look at it. We use DC CIVA as the standard
> way to give a buffer to a device, as that's what the DMA API
> does. Switching to a simple clean is possible, but I don't really see
> what it brings you.
>
> ITS commands are usually written as a single command followed by a
> SYNC/VSYNC. That's a total a 8 64bit words, which makes a cache line
> on 99.999% of the implementations.
>
> What do you gain by keeping the cache line around? Not much. By the
> time you go around the command queue and need the same data again, it
> will have been evicted from your L1 already.

This is a great point. Thank you.

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-09-22 22:03 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-19 11:28 [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Volodymyr Babchuk
2023-09-19 11:28 ` [PATCH v1 1/3] ARM: GICv3 ITS: issue INVALL command after mapping host events Volodymyr Babchuk
2023-09-19 12:22   ` Julien Grall
2023-09-19 11:28 ` [PATCH v1 2/3] ARM: GICv3 ITS: do not invalidate memory while sending a command Volodymyr Babchuk
2023-09-19 13:02   ` Julien Grall
2023-09-19 14:36     ` Volodymyr Babchuk
2023-09-19 15:19       ` Julien Grall
2023-09-22  0:22         ` Volodymyr Babchuk
2023-09-22 10:31           ` Marc Zyngier
2023-09-22 22:02             ` Volodymyr Babchuk
2023-09-19 11:28 ` [PATCH v1 3/3] ARM: GICv3 ITS: flush all buffers, not just command queue Volodymyr Babchuk
2023-09-19 13:10   ` Julien Grall
2023-09-19 13:14     ` Julien Grall
2023-09-19 19:17       ` Stewart Hildebrand
2023-09-19 19:07 ` [PATCH v1 0/3] ARM: ITS: implement TODO and fix issues with cache Stewart Hildebrand

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.