* [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
@ 2018-10-15  8:36 ` Zhen Lei
  0 siblings, 0 replies; 26+ messages in thread
From: Zhen Lei @ 2018-10-15  8:36 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel
  Cc: Zhen Lei, LinuxArm

ITS translation register map:
0x0000-0x003C	Reserved
0x0040		GITS_TRANSLATER
0x0044-0xFFFC	Reserved

The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
data will be written to MSIAddress each time.

MSIAddr: |----4bytes----|----4bytes----|
	 |    MSIData   |    IMPDEF    |

There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
But it will overwrite the 4 bytes memory following "sync_count". It's very
luckly that the previous and the next neighbour of "sync_count" are both aligned
by 8 bytes, so no problem is met now.

It's good to explicitly add a workaround:
1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
   aligned by 8 bytes.
2. Add a "u64" union member to make sure the 4 bytes padding is always exist.

There is no functional change.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/arm-smmu-v3.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5059d09..a07bc0d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -586,7 +586,10 @@ struct arm_smmu_device {
 
 	struct arm_smmu_strtab_cfg	strtab_cfg;
 
+	union {
+	u64				padding; /* workaround for Hisilicon */
 	u32				sync_count;
+	} __attribute__((aligned(8)));
 
 	/* IOMMU core code handle */
 	struct iommu_device		iommu;
-- 
1.8.3
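
The layout the patch aims for can be checked at compile time. The following is a minimal userspace sketch, not the driver code: struct smmu_demo and its "before" and "after" members are hypothetical stand-ins for struct arm_smmu_device and the neighbours of sync_count, and the stdint types replace the kernel's u32/u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct arm_smmu_device. */
struct smmu_demo {
	uint8_t before;			/* any earlier member, to perturb the layout */
	union {
		uint64_t padding;	/* reserves the 4 bytes after sync_count */
		uint32_t sync_count;	/* the word the CPU actually reads */
	} __attribute__((aligned(8)));
	uint32_t after;			/* neighbour that must not be overwritten */
};

/* An 8-byte write starting at sync_count must stay inside the union. */
static_assert(offsetof(struct smmu_demo, sync_count) % 8 == 0,
	      "sync_count must be 8-byte aligned");
static_assert(offsetof(struct smmu_demo, after) -
	      offsetof(struct smmu_demo, sync_count) >= 8,
	      "8 bytes must be reserved starting at sync_count");

/* Number of bytes between the start of sync_count and the next member. */
size_t reserved_bytes(void)
{
	return offsetof(struct smmu_demo, after) -
	       offsetof(struct smmu_demo, sync_count);
}
```

If a later change to the struct broke either property, the build would fail instead of relying on the neighbouring members happening to be 8-byte aligned.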



* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15  8:36 ` Zhen Lei
@ 2018-10-15 11:17   ` John Garry
  -1 siblings, 0 replies; 26+ messages in thread
From: John Garry @ 2018-10-15 11:17 UTC (permalink / raw)
  To: Zhen Lei, Robin Murphy, Will Deacon, Joerg Roedel,
	linux-arm-kernel, iommu, linux-kernel
  Cc: LinuxArm

On 15/10/2018 09:36, Zhen Lei wrote:
> ITS translation register map:
> 0x0000-0x003C	Reserved
> 0x0040		GITS_TRANSLATER
> 0x0044-0xFFFC	Reserved
>

Can you add a better opening than the ITS translation register map?

> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
> data will be written to MSIAddress each time.
>
> MSIAddr: |----4bytes----|----4bytes----|
> 	 |    MSIData   |    IMPDEF    |
>
> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
> But it will overwrite the 4 bytes memory following "sync_count". It's very

I think arm_smmu_device.sync_count is better, or "sync_count member in
the SMMU driver control struct".

> luckly that the previous and the next neighbour of "sync_count" are both aligned

/s/luckly/luckily or fortunately/

> by 8 bytes, so no problem is met now.
>
> It's good to explicitly add a workaround:
> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>    aligned by 8 bytes.
> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>
> There is no functional change.
>
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 5059d09..a07bc0d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>
>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>
> +	union {
> +	u64				padding; /* workaround for Hisilicon */

I think that a more detailed comment is required.

>  	u32				sync_count;

Can you indent these 2 members? However, as discussed internally, this
may have an endianness issue, so it would be better to declare a full
64-bit struct.

> +	} __attribute__((aligned(8)));
>
>  	/* IOMMU core code handle */
>  	struct iommu_device		iommu;
>
Thanks




* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
@ 2018-10-15 12:46   ` Andrew Murray
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Murray @ 2018-10-15 12:46 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel, LinuxArm

Hi Zhen,

On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
> ITS translation register map:
> 0x0000-0x003C	Reserved
> 0x0040		GITS_TRANSLATER
> 0x0044-0xFFFC	Reserved
> 
> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
> data will be written to MSIAddress each time.
> 
> MSIAddr: |----4bytes----|----4bytes----|
> 	 |    MSIData   |    IMPDEF    |
> 
> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
> But it will overwrite the 4 bytes memory following "sync_count". It's very
> luckly that the previous and the next neighbour of "sync_count" are both aligned
> by 8 bytes, so no problem is met now.

My understanding is that MSIs are 32-bit memory writes, and as such the SMMU
performs a 32-bit write in response to the MSI. If so, what is different
about the Hi16xx that causes a problem? Have you been able to adjust
the layout of the arm_smmu_device struct to demonstrate this?

Thanks,

Andrew Murray

> 
> It's good to explicitly add a workaround:
> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>    aligned by 8 bytes.
> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
> 
> There is no functional change.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 5059d09..a07bc0d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>  
>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>  
> +	union {
> +	u64				padding; /* workaround for Hisilicon */
>  	u32				sync_count;
> +	} __attribute__((aligned(8)));
>  
>  	/* IOMMU core code handle */
>  	struct iommu_device		iommu;
> -- 
> 1.8.3
> 
> 
> _______________________________________________
> iommu mailing list
> iommu@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
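
The effect asked about above can be modelled in userspace without the hardware. This is only a sketch under stated assumptions: struct unpadded is a hypothetical layout with no padding after sync_count, and the 8-byte device write described in the patch is imitated with memcpy on the raw bytes of the object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout with no padding after sync_count. */
struct unpadded {
	uint32_t sync_count;
	uint32_t neighbour;
};

static_assert(sizeof(struct unpadded) == 8, "demo assumes a packed 8-byte layout");

/*
 * Model a device that writes 8 bytes (MSIData followed by 4 IMPDEF bytes)
 * to the address of sync_count. Returns the neighbour's value afterwards.
 */
uint32_t neighbour_after_wide_write(uint32_t msi_data, uint32_t impdef)
{
	struct unpadded s = { .sync_count = 0, .neighbour = 0xdeadbeef };
	unsigned char payload[8];

	memcpy(payload, &msi_data, 4);
	memcpy(payload + 4, &impdef, 4);
	/* The wide write lands on the raw bytes of the whole object. */
	memcpy((unsigned char *)&s + offsetof(struct unpadded, sync_count),
	       payload, sizeof(payload));
	return s.neighbour;
}
```

In this model the neighbour no longer holds 0xdeadbeef after the write; it holds the IMPDEF word. Whether the Hi16xx SMMU really issues such a write is the claim made in the patch description, which this sketch does not verify.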


* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
@ 2018-10-15 13:52   ` Robin Murphy
  0 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2018-10-15 13:52 UTC (permalink / raw)
  To: Zhen Lei, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel
  Cc: LinuxArm, nd

On 15/10/18 09:36, Zhen Lei wrote:
> ITS translation register map:
> 0x0000-0x003C	Reserved
> 0x0040		GITS_TRANSLATER
> 0x0044-0xFFFC	Reserved
> 
> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
> data will be written to MSIAddress each time.
> 
> MSIAddr: |----4bytes----|----4bytes----|
> 	 |    MSIData   |    IMPDEF    |
> 
> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
> But it will overwrite the 4 bytes memory following "sync_count". It's very
> luckly that the previous and the next neighbour of "sync_count" are both aligned
> by 8 bytes, so no problem is met now.
> 
> It's good to explicitly add a workaround:
> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>     aligned by 8 bytes.
> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.

Surely the u64 member inherently makes the union, and thus the u32 
member as well, 64-bit-aligned anyway?

Robin.

> There is no functional change.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>   drivers/iommu/arm-smmu-v3.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 5059d09..a07bc0d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>   
>   	struct arm_smmu_strtab_cfg	strtab_cfg;
>   
> +	union {
> +	u64				padding; /* workaround for Hisilicon */
>   	u32				sync_count;
> +	} __attribute__((aligned(8)));
>   
>   	/* IOMMU core code handle */
>   	struct iommu_device		iommu;
> 
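
The point about the u64 member can be checked with a small compile-time sketch. It assumes a C11 compiler and uses the stdint types in place of the kernel's u32/u64; union sync_plain is a hypothetical name for the union from the patch with the explicit attribute left out.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* The union from the patch, without the explicit aligned(8) attribute. */
union sync_plain {
	uint64_t padding;
	uint32_t sync_count;
};

/* A union takes the strictest alignment and the largest size of its members. */
static_assert(alignof(union sync_plain) == alignof(uint64_t),
	      "union inherits the alignment of its u64 member");
static_assert(sizeof(union sync_plain) == sizeof(uint64_t),
	      "union is as large as its u64 member");

/* Alignment the compiler picks for the union on the target ABI. */
size_t plain_union_alignment(void)
{
	return alignof(union sync_plain);
}
```

On arm64 a 64-bit integer is 8-byte aligned under the procedure call standard, so the union is 8-byte aligned there with or without the attribute. The attribute would only change anything on an ABI that aligns 64-bit integers less strictly.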


* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15  8:36 ` Zhen Lei
@ 2018-10-15 17:21   ` Will Deacon
  -1 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2018-10-15 17:21 UTC (permalink / raw)
  To: Zhen Lei
  Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel, LinuxArm

On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
> ITS translation register map:
> 0x0000-0x003C	Reserved
> 0x0040		GITS_TRANSLATER
> 0x0044-0xFFFC	Reserved
> 
> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
> data will be written to MSIAddress each time.
> 
> MSIAddr: |----4bytes----|----4bytes----|
> 	 |    MSIData   |    IMPDEF    |
> 
> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
> But it will overwrite the 4 bytes memory following "sync_count". It's very
> luckly that the previous and the next neighbour of "sync_count" are both aligned
> by 8 bytes, so no problem is met now.
> 
> It's good to explicitly add a workaround:
> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>    aligned by 8 bytes.
> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
> 
> There is no functional change.
> 
> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 5059d09..a07bc0d 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>  
>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>  
> +	union {
> +	u64				padding; /* workaround for Hisilicon */
>  	u32				sync_count;
> +	} __attribute__((aligned(8)));

Won't this already be aligned by the ABI?

Anyway, you'll need to swizzle things for big-endian, I suspect. Maybe you
can do something clever like making sync_count an array of two elements
and determining the offset based on the endianness. Or just keep it simple
like we do for things like struct qrwlock and struct qspinlock and use
#ifdefs.

Also -- you need a comment to explain this insanity :)

Will
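
The #ifdef approach mentioned above could look roughly like the sketch below. It is an illustration only: union sync_slot and the impdef member are hypothetical names, the __BYTE_ORDER__ macros are the GCC/Clang predefined ones rather than the kernel's configuration symbols, and it assumes the 8-byte slot is treated as a single 64-bit value in CPU byte order, which may not match what the hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-byte slot: sync_count always overlays the low 32 bits of raw. */
union sync_slot {
	uint64_t raw;
	struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
		uint32_t sync_count;	/* low half sits at the lower address */
		uint32_t impdef;
#else
		uint32_t impdef;	/* on big-endian the high half comes first */
		uint32_t sync_count;
#endif
	};
};

static_assert(sizeof(union sync_slot) == 8, "slot must be exactly 8 bytes");

/* Read sync_count back from a slot that was stored as one 64-bit value. */
uint32_t sync_count_from_raw(uint64_t raw)
{
	union sync_slot slot = { .raw = raw };

	return slot.sync_count;
}
```

With this ordering, sync_count_from_raw() yields the low 32 bits of the 64-bit value on either endianness, at the cost of a conditional member order.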


* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 17:21   ` Will Deacon
@ 2018-10-15 17:36     ` Robin Murphy
  -1 siblings, 0 replies; 26+ messages in thread
From: Robin Murphy @ 2018-10-15 17:36 UTC (permalink / raw)
  To: Will Deacon, Zhen Lei
  Cc: Joerg Roedel, linux-arm-kernel, iommu, linux-kernel, LinuxArm

On 15/10/18 18:21, Will Deacon wrote:
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C	Reserved
>> 0x0040		GITS_TRANSLATER
>> 0x0044-0xFFFC	Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>> 	 |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>     aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>   
>>   	struct arm_smmu_strtab_cfg	strtab_cfg;
>>   
>> +	union {
>> +	u64				padding; /* workaround for Hisilicon */
>>   	u32				sync_count;
>> +	} __attribute__((aligned(8)));
> 
> Won't this already be aligned by the ABI?
> 
> Anyway, you'll need to swizzle things for big-endian, I suspect. Maybe you
> can do something clever like making sync_count an array of two elements
> and determining the offset based on the endianness. Or just keep it simple
> like we do for things like struct qrwlock and struct qspinlock and use
> #ifdefs.

I don't think so - the CPUs should only ever be making word accesses to 
the u32 member, while the SMMU expects to be writing little-endian data 
to an ITS, so AFAICS the data word will always be at the lower address 
either way.

Although now that it's come up, the pre-existing issue of whether the 
byte order *within* that u32 comes out correct after its round-trip 
through the SMMU is something I need to run away and hurriedly think 
about...

Robin.
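
The argument above can be sketched as a byte-level model. This assumes, as the reply does, that the device writes little-endian data to consecutive addresses; the function names are hypothetical and nothing here touches real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Model the device: MSIData then IMPDEF, each stored in little-endian byte order. */
void device_write_le(unsigned char slot[8], uint32_t msi_data, uint32_t impdef)
{
	for (int i = 0; i < 4; i++) {
		slot[i] = (msi_data >> (8 * i)) & 0xff;
		slot[4 + i] = (impdef >> (8 * i)) & 0xff;
	}
}

/* Decode 4 little-endian bytes, independent of the host CPU's byte order. */
uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* The data word is found at offset 0 of the slot, whatever the host endianness. */
uint32_t data_word_seen_by_cpu(uint32_t msi_data, uint32_t impdef)
{
	unsigned char slot[8];

	device_write_le(slot, msi_data, impdef);
	return read_le32(slot);
}
```

The explicit little-endian decode in read_le32() corresponds to the second concern raised above: a big-endian CPU that read the 4 bytes as a native word would see them byte-swapped, which is a separate matter from where the word is located.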


* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 11:17   ` John Garry
@ 2018-10-16  9:19     ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 26+ messages in thread
From: Leizhen (ThunderTown) @ 2018-10-16  9:19 UTC (permalink / raw)
  To: John Garry, Robin Murphy, Will Deacon, Joerg Roedel,
	linux-arm-kernel, iommu, linux-kernel
  Cc: LinuxArm



On 2018/10/15 19:17, John Garry wrote:
> On 15/10/2018 09:36, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C    Reserved
>> 0x0040        GITS_TRANSLATER
>> 0x0044-0xFFFC    Reserved
>>
> 
> Can you add a better opening than the ITS translation register map?

OK

> 
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>>      |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
> 
> I think arm_smmu_device.sync_count is better, or "sync_count member in the the smmu driver control struct".

OK, I will use "struct" in v2.

+	struct {
 	u32				sync_count;
+	u32				padding;
+	} __attribute__((aligned(8)));
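
A minimal userspace sketch of the layout the struct variant is meant to guarantee. The names sync_slot and write_stays_inside are hypothetical, and uint32_t stands in for the kernel's u32; this is an illustration, not the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the proposed member: the counter plus
 * 4 bytes of explicit padding, forced to 8-byte alignment. */
struct sync_slot {
	uint32_t sync_count;	/* MSIData lands here */
	uint32_t padding;	/* absorbs the extra IMPDEF word */
} __attribute__((aligned(8)));

/* Compile-time checks of the properties the workaround relies on. */
static_assert(sizeof(struct sync_slot) == 8, "slot must cover the 8-byte write");
static_assert(_Alignof(struct sync_slot) == 8, "slot must be 8-byte aligned");
static_assert(offsetof(struct sync_slot, sync_count) == 0, "counter at the low address");
static_assert(offsetof(struct sync_slot, padding) == 4, "padding follows the counter");

/* Returns 1 if an 8-byte write starting at the slot stays inside it. */
static int write_stays_inside(const struct sync_slot *slot)
{
	return ((uintptr_t)slot % 8 == 0) && (sizeof(*slot) >= 8);
}
```

With the explicit padding member the intent is visible in the type itself, and the static assertions fail the build if a later change breaks the layout.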

> 
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
> 
> /s/luckly/luckily or fortunately/

OK, thanks

> 
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>
>>      struct arm_smmu_strtab_cfg    strtab_cfg;
>>
>> +    union {
>> +    u64                padding; /* workaround for Hisilicon */
> 
> I think that a more detailed comment is required.

OK, I will try to describe it more clearly.

> 
>>      u32                sync_count;
> 
> Can you indent these 2 members? However - as discussed internally - this may have endian issue so better to declare full 64b struct.

The indentation is inherited, to keep these members aligned with the others.

There is no endian issue; I have tested it on both little-endian and big-endian.

$gdb vmlinux
......
(gdb) p &((struct arm_smmu_device *)0)->sync_count
$1 = (u32 *) 0x4178
(gdb) p &((struct arm_smmu_device *)0)->tst1
$2 = (int *) 0x4170
(gdb) p &((struct arm_smmu_device *)0)->tst2
$3 = (int *) 0x4180

------------testcase--------

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 5059d09..7c6f7ac 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -586,7 +586,14 @@ struct arm_smmu_device {

        struct arm_smmu_strtab_cfg      strtab_cfg;

+ int                         tst1;
+
+ union {
+ u64                         padding;
        u32                             sync_count;
+ } __attribute__((aligned(8)));
+
+ int                         tst2;

        /* IOMMU core code handle */
        struct iommu_device             iommu;
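
The same offsets can be checked without gdb. Below is a small userspace sketch, assuming a GCC-compatible compiler; struct mock_dev is a hypothetical miniature of the testcase (only tst1, the union and tst2), not the real struct arm_smmu_device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the test layout: an int on either side of
 * the aligned union, mirroring tst1/tst2 in the testcase. */
struct mock_dev {
	int tst1;
	union {
		uint64_t padding;
		uint32_t sync_count;
	} __attribute__((aligned(8)));
	int tst2;
};

/* Distance in bytes from sync_count to the next member. */
static size_t gap_after_sync_count(void)
{
	return offsetof(struct mock_dev, tst2) -
	       offsetof(struct mock_dev, sync_count);
}

/* Distance in bytes from the previous member to sync_count. */
static size_t gap_before_sync_count(void)
{
	return offsetof(struct mock_dev, sync_count) -
	       offsetof(struct mock_dev, tst1);
}
```

Both gaps come out as 8 bytes, which matches the 0x4170 / 0x4178 / 0x4180 spacing in the gdb output above.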

> 
>> +    } __attribute__((aligned(8)));
>>
>>      /* IOMMU core code handle */
>>      struct iommu_device        iommu;
>>
> Thanks
> 
> 
> 
> 
> .
> 

-- 
Thanks!
BestRegards


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 13:52   ` Robin Murphy
  (?)
@ 2018-10-16  9:27     ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 26+ messages in thread
From: Leizhen (ThunderTown) @ 2018-10-16  9:27 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel
  Cc: LinuxArm, nd



On 2018/10/15 21:52, Robin Murphy wrote:
> On 15/10/18 09:36, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C    Reserved
>> 0x0040        GITS_TRANSLATER
>> 0x0044-0xFFFC    Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>>      |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>     aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
> 
> Surely the u64 member inherently makes the union, and thus the u32 member as well, 64-bit-aligned anyway?

Yes, only one of the two steps, "1." or "2.", is really needed. As John Garry suggested, I will change it to a struct.
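
A minimal sketch of Robin's point, with hypothetical names (sync_plain and the two helpers). It only shows that the union takes its alignment from its widest member; on AArch64 that is 8 bytes, while on some 32-bit ABIs (i386 System V, for example) a 64-bit integer is only 4-byte aligned, which is the case where an explicit attribute would still make a difference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The union from the patch, without the aligned attribute. */
union sync_plain {
	uint64_t padding;
	uint32_t sync_count;
};

/* Alignment the compiler assigns purely from the member types. */
static size_t plain_union_align(void)
{
	return _Alignof(union sync_plain);
}

/* Size of the union: always that of the widest member here. */
static size_t plain_union_size(void)
{
	return sizeof(union sync_plain);
}
```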

> 
> Robin.
> 
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>   drivers/iommu/arm-smmu-v3.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>         struct arm_smmu_strtab_cfg    strtab_cfg;
>>   +    union {
>> +    u64                padding; /* workaround for Hisilicon */
>>       u32                sync_count;
>> +    } __attribute__((aligned(8)));
>>         /* IOMMU core code handle */
>>       struct iommu_device        iommu;
>>
> 
> 

-- 
Thanks!
BestRegards


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 17:21   ` Will Deacon
@ 2018-10-16  9:41     ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 26+ messages in thread
From: Leizhen (ThunderTown) @ 2018-10-16  9:41 UTC (permalink / raw)
  To: Will Deacon
  Cc: Robin Murphy, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel, LinuxArm



On 2018/10/16 1:21, Will Deacon wrote:
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C	Reserved
>> 0x0040		GITS_TRANSLATER
>> 0x0044-0xFFFC	Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>> 	 |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>  
>>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>>  
>> +	union {
>> +	u64				padding; /* workaround for Hisilicon */
>>  	u32				sync_count;
>> +	} __attribute__((aligned(8)));
> 
> Won't this already be aligned by the ABI?
> 
> Anyway, you'll need to swizzle things for big-endian, I suspect. Maybe you
> can do something clever like making sync_count an array of two elements
> and determining the offset based on the endianness. Or just keep it simple
> like we do for things like struct qrwlock and struct qspinlock and use
> #ifdefs.

This workaround is a special case: sync_count is only written by the ITS hardware
and only read by software. Although the Hisilicon ITS writes 8 bytes at MSIAddress
(which it requires to be 8-byte aligned), the MSIData value is guaranteed to be
written to the lower 4 bytes (the start address of sync_count). Because sync_count
is a u32, the CPU also reads the 4 bytes at the lower address.
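
The argument above can be sketched in userspace as follows. device_write8 is a hypothetical model of the 8-byte write (MSIData followed by the IMPDEF word, both little-endian), and read_low_word_le decodes the low word byte by byte so the check does not depend on the host's byte order. Neither function exists in the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

union sync_slot {
	uint64_t padding;
	uint32_t sync_count;
};

/* Hypothetical model of the device: store MSIData, then IMPDEF, as two
 * little-endian 32-bit words at consecutive addresses. */
static void device_write8(void *addr, uint32_t msidata, uint32_t impdef)
{
	uint8_t bytes[8];

	for (int i = 0; i < 4; i++) {
		bytes[i]     = (uint8_t)(msidata >> (8 * i));
		bytes[4 + i] = (uint8_t)(impdef >> (8 * i));
	}
	memcpy(addr, bytes, sizeof(bytes));
}

/* Decode the 4 bytes at addr as little-endian, independent of the host. */
static uint32_t read_low_word_le(const void *addr)
{
	const uint8_t *p = addr;

	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
```

Because both union members start at the same address, the word at &slot.sync_count is always the MSIData word; only the byte order inside that word depends on who reads it.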

> 
> Also -- you need a comment to explain this insanity :)
> 
> Will
> 
> .
> 

-- 
Thanks!
BestRegards


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 12:46   ` Andrew Murray
@ 2018-10-16 10:02     ` Leizhen (ThunderTown)
  -1 siblings, 0 replies; 26+ messages in thread
From: Leizhen (ThunderTown) @ 2018-10-16 10:02 UTC (permalink / raw)
  To: Andrew Murray
  Cc: Robin Murphy, Will Deacon, Joerg Roedel, linux-arm-kernel, iommu,
	linux-kernel, LinuxArm



On 2018/10/15 20:46, Andrew Murray wrote:
> Hi Zhen,
> 
> On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
>> ITS translation register map:
>> 0x0000-0x003C	Reserved
>> 0x0040		GITS_TRANSLATER
>> 0x0044-0xFFFC	Reserved
>>
>> The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
>> expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
>> data will be written to MSIAddress each time.
>>
>> MSIAddr: |----4bytes----|----4bytes----|
>> 	 |    MSIData   |    IMPDEF    |
>>
>> There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
>> But it will overwrite the 4 bytes memory following "sync_count". It's very
>> luckly that the previous and the next neighbour of "sync_count" are both aligned
>> by 8 bytes, so no problem is met now.
> 
> My understanding is that MSI's are 32bit memory writes and as such the SMMU
> performs a 32bit write in response to the MSI. If so then what is different
> with the Hi16xx that causes a problem? Have you been able to able to adjust
> the layout of the arm_smmu_device struct to demonstrate this?

Normally, only the 32-bit MSIData is written into sync_count:
|----4bytes----|----4bytes----|
|  sync_count  |     xxxx     |

But on Hi16xx, the ITS hardware writes an extra 32 bits of IMPDEF data into "xxxx". If
"xxxx" is occupied by the next struct member, that member's value is overwritten.

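To make the failure mode concrete, here is a small userspace sketch. The two struct layouts, the neighbour member and write_msi8 are hypothetical; the 8-byte write is modelled with memcpy and is only an illustration of the hardware behaviour described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts: a neighbour directly after the counter, and the
 * same neighbour placed after 4 bytes of padding. */
struct unpadded { uint32_t sync_count; uint32_t neighbour; };
struct padded   { uint32_t sync_count; uint32_t padding; uint32_t neighbour; };

/* Model of the 8-byte write: MSIData followed by the IMPDEF word. */
static void write_msi8(void *addr, uint32_t msidata, uint32_t impdef)
{
	uint32_t words[2] = { msidata, impdef };

	memcpy(addr, words, sizeof(words));
}

/* Without padding, the IMPDEF word lands on the neighbour. */
static uint32_t neighbour_after_write_unpadded(void)
{
	struct unpadded u = { 0, 0xcafef00du };

	write_msi8(&u, 1, 0xdeadbeefu);
	return u.neighbour;
}

/* With padding, the IMPDEF word is absorbed and the neighbour survives. */
static uint32_t neighbour_after_write_padded(void)
{
	struct padded p = { 0, 0, 0xcafef00du };

	write_msi8(&p, 1, 0xdeadbeefu);
	return p.neighbour;
}
```

In the unpadded layout the neighbour reads back as the IMPDEF value; in the padded layout it keeps its original value.
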
> 
> Thanks,
> 
> Andrew Murray
> 
>>
>> It's good to explicitly add a workaround:
>> 1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
>>    aligned by 8 bytes.
>> 2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
>>
>> There is no functional change.
>>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 5059d09..a07bc0d 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -586,7 +586,10 @@ struct arm_smmu_device {
>>  
>>  	struct arm_smmu_strtab_cfg	strtab_cfg;
>>  
>> +	union {
>> +	u64				padding; /* workaround for Hisilicon */
>>  	u32				sync_count;
>> +	} __attribute__((aligned(8)));
>>  
>>  	/* IOMMU core code handle */
>>  	struct iommu_device		iommu;
>> -- 
>> 1.8.3
>>
>>
>> _______________________________________________
>> iommu mailing list
>> iommu@lists.linux-foundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 
> .
> 

-- 
Thanks!
BestRegards


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc
  2018-10-15 17:36     ` Robin Murphy
@ 2018-10-16 10:08       ` Will Deacon
  -1 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2018-10-16 10:08 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Zhen Lei, Joerg Roedel, linux-arm-kernel, iommu, linux-kernel, LinuxArm

On Mon, Oct 15, 2018 at 06:36:52PM +0100, Robin Murphy wrote:
> On 15/10/18 18:21, Will Deacon wrote:
> >On Mon, Oct 15, 2018 at 04:36:16PM +0800, Zhen Lei wrote:
> >>ITS translation register map:
> >>0x0000-0x003C	Reserved
> >>0x0040		GITS_TRANSLATER
> >>0x0044-0xFFFC	Reserved
> >>
> >>The standard GITS_TRANSLATER register in ITS is only 4 bytes, but Hisilicon
> >>expands the next 4 bytes to carry some IMPDEF information. That means, 8 bytes
> >>data will be written to MSIAddress each time.
> >>
> >>MSIAddr: |----4bytes----|----4bytes----|
> >>	 |    MSIData   |    IMPDEF    |
> >>
> >>There is no problem for ITS, because the next 4 bytes space is reserved in ITS.
> >>But it will overwrite the 4 bytes memory following "sync_count". It's very
> >>luckly that the previous and the next neighbour of "sync_count" are both aligned
> >>by 8 bytes, so no problem is met now.
> >>
> >>It's good to explicitly add a workaround:
> >>1. Add gcc __attribute__((aligned(8))) to make sure that "sync_count" is always
> >>    aligned by 8 bytes.
> >>2. Add a "u64" union member to make sure the 4 bytes padding is always exist.
> >>
> >>There is no functional change.
> >>
> >>Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> >>---
> >>  drivers/iommu/arm-smmu-v3.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >>diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> >>index 5059d09..a07bc0d 100644
> >>--- a/drivers/iommu/arm-smmu-v3.c
> >>+++ b/drivers/iommu/arm-smmu-v3.c
> >>@@ -586,7 +586,10 @@ struct arm_smmu_device {
> >>  	struct arm_smmu_strtab_cfg	strtab_cfg;
> >>+	union {
> >>+	u64				padding; /* workaround for Hisilicon */
> >>  	u32				sync_count;
> >>+	} __attribute__((aligned(8)));
> >
> >Won't this already be aligned by the ABI?
> >
> >Anyway, you'll need to swizzle things for big-endian, I suspect. Maybe you
> >can do something clever like making sync_count an array of two elements
> >and determining the offset based on the endianness. Or just keep it simple
> >like we do for things like struct qrwlock and struct qspinlock and use
> >#ifdefs.
> 
> I don't think so - the CPUs should only ever be making word accesses to the
> u32 member, while the SMMU expects to be writing little-endian data to an
> ITS, so AFAICS the data word will always be at the lower address either way.

Yes, thanks. I'd actually got the union layout wrong in my head and not
realised that both members will start at the same address. Ok, so that works.

> Although now that it's come up, the pre-existing issue of whether the byte
> order *within* that u32 comes out correct after its round-trip through the
> SMMU is something I need to run away and hurriedly think about...

Ha, oops! Everything going into the command queue is swabbed, so I guess we
need to swab the data again in smp_cond_load_acquire(). I'll leave it with
you...

Will

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-10-16 10:08 UTC | newest]

Thread overview: 26+ messages
2018-10-15  8:36 [PATCH 1/1] iommu/arm-smmu-v3: eliminate a potential memory corruption on Hi16xx soc Zhen Lei
2018-10-15  8:36 ` Zhen Lei
2018-10-15  8:36 ` Zhen Lei
2018-10-15 11:17 ` John Garry
2018-10-15 11:17   ` John Garry
2018-10-16  9:19   ` Leizhen (ThunderTown)
2018-10-16  9:19     ` Leizhen (ThunderTown)
2018-10-15 12:46 ` Andrew Murray
2018-10-15 12:46   ` Andrew Murray
2018-10-15 12:46   ` Andrew Murray
2018-10-16 10:02   ` Leizhen (ThunderTown)
2018-10-16 10:02     ` Leizhen (ThunderTown)
2018-10-15 13:52 ` Robin Murphy
2018-10-15 13:52   ` Robin Murphy
2018-10-15 13:52   ` Robin Murphy
2018-10-16  9:27   ` Leizhen (ThunderTown)
2018-10-16  9:27     ` Leizhen (ThunderTown)
2018-10-16  9:27     ` Leizhen (ThunderTown)
2018-10-15 17:21 ` Will Deacon
2018-10-15 17:21   ` Will Deacon
2018-10-15 17:36   ` Robin Murphy
2018-10-15 17:36     ` Robin Murphy
2018-10-16 10:08     ` Will Deacon
2018-10-16 10:08       ` Will Deacon
2018-10-16  9:41   ` Leizhen (ThunderTown)
2018-10-16  9:41     ` Leizhen (ThunderTown)