All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
@ 2016-10-29 11:19 ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2016-10-29 11:19 UTC (permalink / raw)
  To: linux-arm-kernel, kvm, kvmarm; +Cc: Andre Przywara

Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
does not implement.

As we really don't want to implement complex division in the kernel,
the only other option is to prove to the compiler that there is only
a few values that are possible for the number of bits per IRQ, and
that they are all power of 2.

We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
the supported set of values (1, 2, 8, 64), and perform the computation
accordingly. When "bits" is a constant, the compiler optimizes
away the other cases. If not, we end-up with a small number of cases
that GCC optimises reasonably well. Out of range values are detected
both at build time (constants) and at run time (variables).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
This should be applied *before* Andre's patch fixing out of bound SPIs.

 virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
index 4c34d39..a457282 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.h
+++ b/virt/kvm/arm/vgic/vgic-mmio.h
@@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
  * multiplication with the inverted fraction, and scale up both the
  * numerator and denominator with 8 to support at most 64 bits per IRQ:
  */
-#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
+#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
 					64 / (bits) / 8)
 
 /*
+ * Perform the same computation, but also handle non-constant number
+ * of bits. We only care about the few cases that are required by
+ * GICv2/v3.
+ */
+#define VGIC_ADDR_TO_INTID(addr, bits)				\
+	({							\
+		u32 __v;					\
+		switch((bits)) {				\
+		case 1:						\
+			__v = __VGIC_ADDR_INTID((addr), 1);	\
+			break;					\
+		case 2:						\
+			__v = __VGIC_ADDR_INTID((addr), 2);	\
+			break;					\
+		case 8:						\
+			__v = __VGIC_ADDR_INTID((addr), 8);	\
+			break;					\
+		case 64:					\
+			__v = __VGIC_ADDR_INTID((addr), 64);	\
+			break;					\
+		default:					\
+			if (__builtin_constant_p((bits)))	\
+				BUILD_BUG();			\
+			else					\
+				BUG();				\
+		}						\
+								\
+		__v;						\
+	})
+
+/*
  * Some VGIC registers store per-IRQ information, with a different number
  * of bits per IRQ. For those registers this macro is used.
  * The _WITH_LENGTH version instantiates registers with a fixed length
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
@ 2016-10-29 11:19 ` Marc Zyngier
  0 siblings, 0 replies; 8+ messages in thread
From: Marc Zyngier @ 2016-10-29 11:19 UTC (permalink / raw)
  To: linux-arm-kernel

Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
does not implement.

As we really don't want to implement complex division in the kernel,
the only other option is to prove to the compiler that there is only
a few values that are possible for the number of bits per IRQ, and
that they are all power of 2.

We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
the supported set of values (1, 2, 8, 64), and perform the computation
accordingly. When "bits" is a constant, the compiler optimizes
away the other cases. If not, we end-up with a small number of cases
that GCC optimises reasonably well. Out of range values are detected
both at build time (constants) and at run time (variables).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
This should be applied *before* Andre's patch fixing out of bound SPIs.

 virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
index 4c34d39..a457282 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.h
+++ b/virt/kvm/arm/vgic/vgic-mmio.h
@@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
  * multiplication with the inverted fraction, and scale up both the
  * numerator and denominator with 8 to support at most 64 bits per IRQ:
  */
-#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
+#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
 					64 / (bits) / 8)
 
 /*
+ * Perform the same computation, but also handle non-constant number
+ * of bits. We only care about the few cases that are required by
+ * GICv2/v3.
+ */
+#define VGIC_ADDR_TO_INTID(addr, bits)				\
+	({							\
+		u32 __v;					\
+		switch((bits)) {				\
+		case 1:						\
+			__v = __VGIC_ADDR_INTID((addr), 1);	\
+			break;					\
+		case 2:						\
+			__v = __VGIC_ADDR_INTID((addr), 2);	\
+			break;					\
+		case 8:						\
+			__v = __VGIC_ADDR_INTID((addr), 8);	\
+			break;					\
+		case 64:					\
+			__v = __VGIC_ADDR_INTID((addr), 64);	\
+			break;					\
+		default:					\
+			if (__builtin_constant_p((bits)))	\
+				BUILD_BUG();			\
+			else					\
+				BUG();				\
+		}						\
+								\
+		__v;						\
+	})
+
+/*
  * Some VGIC registers store per-IRQ information, with a different number
  * of bits per IRQ. For those registers this macro is used.
  * The _WITH_LENGTH version instantiates registers with a fixed length
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
  2016-10-29 11:19 ` Marc Zyngier
@ 2016-11-01 15:28   ` Christoffer Dall
  -1 siblings, 0 replies; 8+ messages in thread
From: Christoffer Dall @ 2016-11-01 15:28 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Andre Przywara, kvm, linux-arm-kernel, kvmarm

On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
> does not implement.
> 
> As we really don't want to implement complex division in the kernel,
> the only other option is to prove to the compiler that there is only
> a few values that are possible for the number of bits per IRQ, and
> that they are all power of 2.
> 
> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
> the supported set of values (1, 2, 8, 64), and perform the computation
> accordingly. When "bits" is a constant, the compiler optimizes
> away the other cases. If not, we end-up with a small number of cases
> that GCC optimises reasonably well. Out of range values are detected
> both at build time (constants) and at run time (variables).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> This should be applied *before* Andre's patch fixing out of bound SPIs.
> 
>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
> index 4c34d39..a457282 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio.h
> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
>   * multiplication with the inverted fraction, and scale up both the
>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
>   */
> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>  					64 / (bits) / 8)
>  
>  /*
> + * Perform the same computation, but also handle non-constant number
> + * of bits. We only care about the few cases that are required by
> + * GICv2/v3.
> + */
> +#define VGIC_ADDR_TO_INTID(addr, bits)				\
> +	({							\
> +		u32 __v;					\
> +		switch((bits)) {				\
> +		case 1:						\
> +			__v = __VGIC_ADDR_INTID((addr), 1);	\
> +			break;					\
> +		case 2:						\
> +			__v = __VGIC_ADDR_INTID((addr), 2);	\
> +			break;					\
> +		case 8:						\
> +			__v = __VGIC_ADDR_INTID((addr), 8);	\
> +			break;					\
> +		case 64:					\
> +			__v = __VGIC_ADDR_INTID((addr), 64);	\
> +			break;					\
> +		default:					\
> +			if (__builtin_constant_p((bits)))	\
> +				BUILD_BUG();			\
> +			else					\
> +				BUG();				\
> +		}						\
> +								\
> +		__v;						\
> +	})
> +
> +/*
>   * Some VGIC registers store per-IRQ information, with a different number
>   * of bits per IRQ. For those registers this macro is used.
>   * The _WITH_LENGTH version instantiates registers with a fixed length
> -- 
> 2.9.3
> 

Looks functionally correct, just wondering if it's cleaner to turn the
whole thing into a static inline, or if it can be rewritten to use
shifts with any benefit.

In any case, if you like this version:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
@ 2016-11-01 15:28   ` Christoffer Dall
  0 siblings, 0 replies; 8+ messages in thread
From: Christoffer Dall @ 2016-11-01 15:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
> does not implement.
> 
> As we really don't want to implement complex division in the kernel,
> the only other option is to prove to the compiler that there is only
> a few values that are possible for the number of bits per IRQ, and
> that they are all power of 2.
> 
> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
> the supported set of values (1, 2, 8, 64), and perform the computation
> accordingly. When "bits" is a constant, the compiler optimizes
> away the other cases. If not, we end-up with a small number of cases
> that GCC optimises reasonably well. Out of range values are detected
> both at build time (constants) and at run time (variables).
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
> This should be applied *before* Andre's patch fixing out of bound SPIs.
> 
>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
> index 4c34d39..a457282 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio.h
> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
>   * multiplication with the inverted fraction, and scale up both the
>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
>   */
> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>  					64 / (bits) / 8)
>  
>  /*
> + * Perform the same computation, but also handle non-constant number
> + * of bits. We only care about the few cases that are required by
> + * GICv2/v3.
> + */
> +#define VGIC_ADDR_TO_INTID(addr, bits)				\
> +	({							\
> +		u32 __v;					\
> +		switch((bits)) {				\
> +		case 1:						\
> +			__v = __VGIC_ADDR_INTID((addr), 1);	\
> +			break;					\
> +		case 2:						\
> +			__v = __VGIC_ADDR_INTID((addr), 2);	\
> +			break;					\
> +		case 8:						\
> +			__v = __VGIC_ADDR_INTID((addr), 8);	\
> +			break;					\
> +		case 64:					\
> +			__v = __VGIC_ADDR_INTID((addr), 64);	\
> +			break;					\
> +		default:					\
> +			if (__builtin_constant_p((bits)))	\
> +				BUILD_BUG();			\
> +			else					\
> +				BUG();				\
> +		}						\
> +								\
> +		__v;						\
> +	})
> +
> +/*
>   * Some VGIC registers store per-IRQ information, with a different number
>   * of bits per IRQ. For those registers this macro is used.
>   * The _WITH_LENGTH version instantiates registers with a fixed length
> -- 
> 2.9.3
> 

Looks functionally correct, just wondering if it's cleaner to turn the
whole thing into a static inline, or if it can be rewritten to use
shifts with any benefit.

In any case, if you like this version:

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
  2016-11-01 15:28   ` Christoffer Dall
@ 2016-11-01 16:50     ` Andre Przywara
  -1 siblings, 0 replies; 8+ messages in thread
From: Andre Przywara @ 2016-11-01 16:50 UTC (permalink / raw)
  To: Christoffer Dall, Marc Zyngier; +Cc: kvm, linux-arm-kernel, kvmarm

Hej,

On 01/11/16 15:28, Christoffer Dall wrote:
> On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
>> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
>> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
>> does not implement.
>>
>> As we really don't want to implement complex division in the kernel,
>> the only other option is to prove to the compiler that there is only
>> a few values that are possible for the number of bits per IRQ, and
>> that they are all power of 2.
>>
>> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
>> the supported set of values (1, 2, 8, 64), and perform the computation
>> accordingly. When "bits" is a constant, the compiler optimizes
>> away the other cases. If not, we end-up with a small number of cases
>> that GCC optimises reasonably well. Out of range values are detected
>> both at build time (constants) and at run time (variables).
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>> This should be applied *before* Andre's patch fixing out of bound SPIs.
>>
>>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
>>  1 file changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
>> index 4c34d39..a457282 100644
>> --- a/virt/kvm/arm/vgic/vgic-mmio.h
>> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
>> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
>>   * multiplication with the inverted fraction, and scale up both the
>>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
>>   */
>> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>>  					64 / (bits) / 8)

I remember we discussed this in length some months ago, but I was
wondering if this isn't simply:
	((addr & mask) * 8) / bits
and thus can be written as:
	((addr & mask) * 8) >> ilog2(bits)
We require <bits> to be a power of two anyway for the MASK macro.

ilog2(constant) is nicely optimized at compile time, but even at runtime
on both ARM variants it boils down to "31 - clz(bits)", which are two or
three instructions AFAICS.

Does that make sense or am I missing something here?

I changed this in my patch and adjusted the comment, quick testing seems
to be fine on Midway and Juno.

Will send it out in a minute, if no-one objects.

Cheers,
Andre.

>>  
>>  /*
>> + * Perform the same computation, but also handle non-constant number
>> + * of bits. We only care about the few cases that are required by
>> + * GICv2/v3.
>> + */
>> +#define VGIC_ADDR_TO_INTID(addr, bits)				\
>> +	({							\
>> +		u32 __v;					\
>> +		switch((bits)) {				\
>> +		case 1:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 1);	\
>> +			break;					\
>> +		case 2:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 2);	\
>> +			break;					\
>> +		case 8:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 8);	\
>> +			break;					\
>> +		case 64:					\
>> +			__v = __VGIC_ADDR_INTID((addr), 64);	\
>> +			break;					\
>> +		default:					\
>> +			if (__builtin_constant_p((bits)))	\
>> +				BUILD_BUG();			\
>> +			else					\
>> +				BUG();				\
>> +		}						\
>> +								\
>> +		__v;						\
>> +	})
>> +
>> +/*
>>   * Some VGIC registers store per-IRQ information, with a different number
>>   * of bits per IRQ. For those registers this macro is used.
>>   * The _WITH_LENGTH version instantiates registers with a fixed length
>> -- 
>> 2.9.3
>>
> 
> Looks functionally correct, just wondering if it's cleaner to turn the
> whole thing into a static inline, or if it can be rewritten to use
> shifts with any benefit.
> 
> In any case, if you like this version:
> 
> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
@ 2016-11-01 16:50     ` Andre Przywara
  0 siblings, 0 replies; 8+ messages in thread
From: Andre Przywara @ 2016-11-01 16:50 UTC (permalink / raw)
  To: linux-arm-kernel

Hej,

On 01/11/16 15:28, Christoffer Dall wrote:
> On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
>> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
>> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
>> does not implement.
>>
>> As we really don't want to implement complex division in the kernel,
>> the only other option is to prove to the compiler that there is only
>> a few values that are possible for the number of bits per IRQ, and
>> that they are all power of 2.
>>
>> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
>> the supported set of values (1, 2, 8, 64), and perform the computation
>> accordingly. When "bits" is a constant, the compiler optimizes
>> away the other cases. If not, we end-up with a small number of cases
>> that GCC optimises reasonably well. Out of range values are detected
>> both at build time (constants) and at run time (variables).
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>> This should be applied *before* Andre's patch fixing out of bound SPIs.
>>
>>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
>>  1 file changed, 32 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
>> index 4c34d39..a457282 100644
>> --- a/virt/kvm/arm/vgic/vgic-mmio.h
>> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
>> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
>>   * multiplication with the inverted fraction, and scale up both the
>>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
>>   */
>> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
>>  					64 / (bits) / 8)

I remember we discussed this in length some months ago, but I was
wondering if this isn't simply:
	((addr & mask) * 8) / bits
and thus can be written as:
	((addr & mask) * 8) >> ilog2(bits)
We require <bits> to be a power of two anyway for the MASK macro.

ilog2(constant) is nicely optimized at compile time, but even at runtime
on both ARM variants it boils down to "31 - clz(bits)", which are two or
three instructions AFAICS.

Does that make sense or am I missing something here?

I changed this in my patch and adjusted the comment, quick testing seems
to be fine on Midway and Juno.

Will send it out in a minute, if no-one objects.

Cheers,
Andre.

>>  
>>  /*
>> + * Perform the same computation, but also handle non-constant number
>> + * of bits. We only care about the few cases that are required by
>> + * GICv2/v3.
>> + */
>> +#define VGIC_ADDR_TO_INTID(addr, bits)				\
>> +	({							\
>> +		u32 __v;					\
>> +		switch((bits)) {				\
>> +		case 1:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 1);	\
>> +			break;					\
>> +		case 2:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 2);	\
>> +			break;					\
>> +		case 8:						\
>> +			__v = __VGIC_ADDR_INTID((addr), 8);	\
>> +			break;					\
>> +		case 64:					\
>> +			__v = __VGIC_ADDR_INTID((addr), 64);	\
>> +			break;					\
>> +		default:					\
>> +			if (__builtin_constant_p((bits)))	\
>> +				BUILD_BUG();			\
>> +			else					\
>> +				BUG();				\
>> +		}						\
>> +								\
>> +		__v;						\
>> +	})
>> +
>> +/*
>>   * Some VGIC registers store per-IRQ information, with a different number
>>   * of bits per IRQ. For those registers this macro is used.
>>   * The _WITH_LENGTH version instantiates registers with a fixed length
>> -- 
>> 2.9.3
>>
> 
> Looks functionally correct, just wondering if it's cleaner to turn the
> whole thing into a static inline, or if it can be rewritten to use
> shifts with any benefit.
> 
> In any case, if you like this version:
> 
> Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
  2016-11-01 16:50     ` Andre Przywara
@ 2016-11-01 19:31       ` Christoffer Dall
  -1 siblings, 0 replies; 8+ messages in thread
From: Christoffer Dall @ 2016-11-01 19:31 UTC (permalink / raw)
  To: Andre Przywara; +Cc: Marc Zyngier, kvm, linux-arm-kernel, kvmarm

On Tue, Nov 01, 2016 at 04:50:26PM +0000, Andre Przywara wrote:
> Hej,
> 
> On 01/11/16 15:28, Christoffer Dall wrote:
> > On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
> >> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
> >> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
> >> does not implement.
> >>
> >> As we really don't want to implement complex division in the kernel,
> >> the only other option is to prove to the compiler that there is only
> >> a few values that are possible for the number of bits per IRQ, and
> >> that they are all power of 2.
> >>
> >> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
> >> the supported set of values (1, 2, 8, 64), and perform the computation
> >> accordingly. When "bits" is a constant, the compiler optimizes
> >> away the other cases. If not, we end-up with a small number of cases
> >> that GCC optimises reasonably well. Out of range values are detected
> >> both at build time (constants) and at run time (variables).
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >> This should be applied *before* Andre's patch fixing out of bound SPIs.
> >>
> >>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
> >>  1 file changed, 32 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
> >> index 4c34d39..a457282 100644
> >> --- a/virt/kvm/arm/vgic/vgic-mmio.h
> >> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
> >> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
> >>   * multiplication with the inverted fraction, and scale up both the
> >>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
> >>   */
> >> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> >> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> >>  					64 / (bits) / 8)
> 
> I remember we discussed this in length some months ago, but I was
> wondering if this isn't simply:
> 	((addr & mask) * 8) / bits

that's just dividing 8 into the 64, so that should be fine, yes.

> and thus can be written as:
> 	((addr & mask) * 8) >> ilog2(bits)

right, I follow that.

> We require <bits> to be a power of two anyway for the MASK macro.
> 
> ilog2(constant) is nicely optimized at compile time, but even at runtime
> on both ARM variants it boils down to "31 - clz(bits)", which are two or
> three instructions AFAICS.

cool with the ilog2 macro.

> 
> Does that make sense or am I missing something here?

makes sense I think.  Good luck writing a comment so that I can
understand this calculation later ;)

> 
> I changed this in my patch and adjusted the comment, quick testing seems
> to be fine on Midway and Juno.
> 
> Will send it out in a minute, if no-one objects.
> 
I don't object.

-Christoffer

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions
@ 2016-11-01 19:31       ` Christoffer Dall
  0 siblings, 0 replies; 8+ messages in thread
From: Christoffer Dall @ 2016-11-01 19:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 01, 2016 at 04:50:26PM +0000, Andre Przywara wrote:
> Hej,
> 
> On 01/11/16 15:28, Christoffer Dall wrote:
> > On Sat, Oct 29, 2016 at 12:19:01PM +0100, Marc Zyngier wrote:
> >> Using non-constant number of bits for VGIC_ADDR_TO_INTID() leads
> >> to gcc 6.1 emiting calls to __aeabi_uldivmod, which the kernel
> >> does not implement.
> >>
> >> As we really don't want to implement complex division in the kernel,
> >> the only other option is to prove to the compiler that there is only
> >> a few values that are possible for the number of bits per IRQ, and
> >> that they are all power of 2.
> >>
> >> We turn the VGIC_ADDR_TO_INTID macro into a switch that looks for
> >> the supported set of values (1, 2, 8, 64), and perform the computation
> >> accordingly. When "bits" is a constant, the compiler optimizes
> >> away the other cases. If not, we end-up with a small number of cases
> >> that GCC optimises reasonably well. Out of range values are detected
> >> both at build time (constants) and at run time (variables).
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >> This should be applied *before* Andre's patch fixing out of bound SPIs.
> >>
> >>  virt/kvm/arm/vgic/vgic-mmio.h | 33 ++++++++++++++++++++++++++++++++-
> >>  1 file changed, 32 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
> >> index 4c34d39..a457282 100644
> >> --- a/virt/kvm/arm/vgic/vgic-mmio.h
> >> +++ b/virt/kvm/arm/vgic/vgic-mmio.h
> >> @@ -57,10 +57,41 @@ extern struct kvm_io_device_ops kvm_io_gic_ops;
> >>   * multiplication with the inverted fraction, and scale up both the
> >>   * numerator and denominator with 8 to support at most 64 bits per IRQ:
> >>   */
> >> -#define VGIC_ADDR_TO_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> >> +#define __VGIC_ADDR_INTID(addr, bits)  (((addr) & VGIC_ADDR_IRQ_MASK(bits)) * \
> >>  					64 / (bits) / 8)
> 
> I remember we discussed this in length some months ago, but I was
> wondering if this isn't simply:
> 	((addr & mask) * 8) / bits

that's just dividing 8 into the 64, so that should be fine, yes.

> and thus can be written as:
> 	((addr & mask) * 8) >> ilog2(bits)

right, I follow that.

> We require <bits> to be a power of two anyway for the MASK macro.
> 
> ilog2(constant) is nicely optimized at compile time, but even at runtime
> on both ARM variants it boils down to "31 - clz(bits)", which are two or
> three instructions AFAICS.

cool with the ilog2 macro.

> 
> Does that make sense or am I missing something here?

makes sense I think.  Good luck writing a comment so that I can
understand this calculation later ;)

> 
> I changed this in my patch and adjusted the comment, quick testing seems
> to be fine on Midway and Juno.
> 
> Will send it out in a minute, if no-one objects.
> 
I don't object.

-Christoffer

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2016-11-01 19:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-29 11:19 [PATCH] KVM: arm/arm64: vgic: Prevent VGIC_ADDR_TO_INTID from emiting divisions Marc Zyngier
2016-10-29 11:19 ` Marc Zyngier
2016-11-01 15:28 ` Christoffer Dall
2016-11-01 15:28   ` Christoffer Dall
2016-11-01 16:50   ` Andre Przywara
2016-11-01 16:50     ` Andre Przywara
2016-11-01 19:31     ` Christoffer Dall
2016-11-01 19:31       ` Christoffer Dall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.