IOMMU Archive on lore.kernel.org
 help / color / Atom feed
* arm64 iommu groups issue
@ 2019-09-19  8:43 John Garry
  2019-09-19 13:25 ` Robin Murphy
  0 siblings, 1 reply; 4+ messages in thread
From: John Garry @ 2019-09-19  8:43 UTC (permalink / raw)
  To: Robin Murphy, Marc Zyngier, Will Deacon, Lorenzo Pieralisi,
	Sudeep Holla, Guohanjun (Hanjun Guo)
  Cc: linux-kernel, Alex Williamson, Linuxarm, iommu, Bjorn Helgaas,
	linux-arm-kernel

Hi all,

We have noticed a special behaviour on our arm64 D05 board when the SMMU 
is enabled with regards PCI device iommu groups.

This platform does not support ACS, yet we find that all functions for a 
PCI device are not grouped together:

root@ubuntu:/sys# dmesg | grep "Adding to iommu group"
[    7.307539] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
[   12.590533] hns_dsaf HISI00B2:00: Adding to iommu group 1
[   13.688527] mlx5_core 000a:11:00.0: Adding to iommu group 2
[   14.324606] mlx5_core 000a:11:00.1: Adding to iommu group 3
[   14.937090] ehci-platform PNP0D20:00: Adding to iommu group 4
[   15.276637] pcieport 0002:f8:00.0: Adding to iommu group 5
[   15.340845] pcieport 0004:88:00.0: Adding to iommu group 6
[   15.392098] pcieport 0005:78:00.0: Adding to iommu group 7
[   15.443356] pcieport 000a:10:00.0: Adding to iommu group 8
[   15.484975] pcieport 000c:20:00.0: Adding to iommu group 9
[   15.543647] pcieport 000d:30:00.0: Adding to iommu group 10
[   15.599771] serial 0002:f9:00.0: Adding to iommu group 5
[   15.690807] serial 0002:f9:00.1: Adding to iommu group 5
[   84.322097] mlx5_core 000a:11:00.2: Adding to iommu group 8
[   84.856408] mlx5_core 000a:11:00.3: Adding to iommu group 8

root@ubuntu:/sys#  lspci -tv
lspci -tvv
-+-[000d:30]---00.0-[31]--
   +-[000c:20]---00.0-[21]----00.0  Huawei Technologies Co., Ltd.
   +-[000a:10]---00.0-[11-12]--+-00.0  Mellanox [ConnectX-5]
   |                           +-00.1  Mellanox [ConnectX-5]
   |                           +-00.2  Mellanox [ConnectX-5 VF]
   |                           \-00.3  Mellanox [ConnectX-5 VF]
   +-[0007:90]---00.0-[91]----00.0  Huawei Technologies Co., ...
   +-[0006:c0]---00.0-[c1]--
   +-[0005:78]---00.0-[79]--
   +-[0004:88]---00.0-[89]--
   +-[0002:f8]---00.0-[f9]--+-00.0  MosChip Semiconductor Technology ...
   |                        +-00.1  MosChip Semiconductor Technology ...
   |                        \-00.2  MosChip Semiconductor Technology ...
   \-[0000:00]-

For the PCI devices in question - on port 000a:10:00.0 - you will notice 
that the port and VFs (000a:11:00.2, 3) are in one group, yet the 2 PFs 
(000a:11:00.0, 000a:11:00.1) are in separate groups.

I also notice the same ordering nature on our D06 platform - the 
pcieport is added to an iommu group after PF for that port. However this 
platform supports ACS, so not such a problem.

After some checking, I find that when the pcieport driver probes, the 
associated SMMU device had not registered yet with the IOMMU framework, 
so we defer the probe for this device - in iort.c:iort_iommu_xlate(), 
when no iommu ops are available, we defer.

Yet, when the mlx5 PF devices probe, the iommu ops are available at this 
stage. So the probe continues and we get an iommu group for the device - 
but not the same group as the parent port, as it has not yet been added 
to a group. When the port eventually probes it gets a new, separate group.

This all seems to be as the built-in module init ordering is as follows: 
pcieport drv, smmu drv, mlx5 drv

I notice that if I build the mlx5 drv as a ko and insert after boot, all 
functions + pcieport are in the same group:

[   11.530046] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
[   17.301093] hns_dsaf HISI00B2:00: Adding to iommu group 1
[   18.743600] ehci-platform PNP0D20:00: Adding to iommu group 2
[   20.212284] pcieport 0002:f8:00.0: Adding to iommu group 3
[   20.356303] pcieport 0004:88:00.0: Adding to iommu group 4
[   20.493337] pcieport 0005:78:00.0: Adding to iommu group 5
[   20.702999] pcieport 000a:10:00.0: Adding to iommu group 6
[   20.859183] pcieport 000c:20:00.0: Adding to iommu group 7
[   20.996140] pcieport 000d:30:00.0: Adding to iommu group 8
[   21.152637] serial 0002:f9:00.0: Adding to iommu group 3
[   21.346991] serial 0002:f9:00.1: Adding to iommu group 3
[  100.754306] mlx5_core 000a:11:00.0: Adding to iommu group 6
[  101.420156] mlx5_core 000a:11:00.1: Adding to iommu group 6
[  292.481714] mlx5_core 000a:11:00.2: Adding to iommu group 6
[  293.281061] mlx5_core 000a:11:00.3: Adding to iommu group 6

This does seem like a problem for arm64 platforms which don't support 
ACS, yet enable an SMMU. Maybe also a problem even if they do support ACS.

Opinion?

Thanks,
John

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: arm64 iommu groups issue
  2019-09-19  8:43 arm64 iommu groups issue John Garry
@ 2019-09-19 13:25 ` Robin Murphy
  2019-09-19 14:35   ` John Garry
  0 siblings, 1 reply; 4+ messages in thread
From: Robin Murphy @ 2019-09-19 13:25 UTC (permalink / raw)
  To: John Garry, Marc Zyngier, Will Deacon, Lorenzo Pieralisi,
	Sudeep Holla, Guohanjun (Hanjun Guo)
  Cc: linux-kernel, Alex Williamson, Linuxarm, iommu, Bjorn Helgaas,
	linux-arm-kernel

Hi John,

On 19/09/2019 09:43, John Garry wrote:
> Hi all,
> 
> We have noticed a special behaviour on our arm64 D05 board when the SMMU 
> is enabled with regards PCI device iommu groups.
> 
> This platform does not support ACS, yet we find that all functions for a 
> PCI device are not grouped together:
> 
> root@ubuntu:/sys# dmesg | grep "Adding to iommu group"
> [    7.307539] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
> [   12.590533] hns_dsaf HISI00B2:00: Adding to iommu group 1
> [   13.688527] mlx5_core 000a:11:00.0: Adding to iommu group 2
> [   14.324606] mlx5_core 000a:11:00.1: Adding to iommu group 3
> [   14.937090] ehci-platform PNP0D20:00: Adding to iommu group 4
> [   15.276637] pcieport 0002:f8:00.0: Adding to iommu group 5
> [   15.340845] pcieport 0004:88:00.0: Adding to iommu group 6
> [   15.392098] pcieport 0005:78:00.0: Adding to iommu group 7
> [   15.443356] pcieport 000a:10:00.0: Adding to iommu group 8
> [   15.484975] pcieport 000c:20:00.0: Adding to iommu group 9
> [   15.543647] pcieport 000d:30:00.0: Adding to iommu group 10
> [   15.599771] serial 0002:f9:00.0: Adding to iommu group 5
> [   15.690807] serial 0002:f9:00.1: Adding to iommu group 5
> [   84.322097] mlx5_core 000a:11:00.2: Adding to iommu group 8
> [   84.856408] mlx5_core 000a:11:00.3: Adding to iommu group 8
> 
> root@ubuntu:/sys#  lspci -tv
> lspci -tvv
> -+-[000d:30]---00.0-[31]--
>    +-[000c:20]---00.0-[21]----00.0  Huawei Technologies Co., Ltd.
>    +-[000a:10]---00.0-[11-12]--+-00.0  Mellanox [ConnectX-5]
>    |                           +-00.1  Mellanox [ConnectX-5]
>    |                           +-00.2  Mellanox [ConnectX-5 VF]
>    |                           \-00.3  Mellanox [ConnectX-5 VF]
>    +-[0007:90]---00.0-[91]----00.0  Huawei Technologies Co., ...
>    +-[0006:c0]---00.0-[c1]--
>    +-[0005:78]---00.0-[79]--
>    +-[0004:88]---00.0-[89]--
>    +-[0002:f8]---00.0-[f9]--+-00.0  MosChip Semiconductor Technology ...
>    |                        +-00.1  MosChip Semiconductor Technology ...
>    |                        \-00.2  MosChip Semiconductor Technology ...
>    \-[0000:00]-
> 
> For the PCI devices in question - on port 000a:10:00.0 - you will notice 
> that the port and VFs (000a:11:00.2, 3) are in one group, yet the 2 PFs 
> (000a:11:00.0, 000a:11:00.1) are in separate groups.
> 
> I also notice the same ordering nature on our D06 platform - the 
> pcieport is added to an iommu group after PF for that port. However this 
> platform supports ACS, so not such a problem.
> 
> After some checking, I find that when the pcieport driver probes, the 
> associated SMMU device had not registered yet with the IOMMU framework, 
> so we defer the probe for this device - in iort.c:iort_iommu_xlate(), 
> when no iommu ops are available, we defer.
> 
> Yet, when the mlx5 PF devices probe, the iommu ops are available at this 
> stage. So the probe continues and we get an iommu group for the device - 
> but not the same group as the parent port, as it has not yet been added 
> to a group. When the port eventually probes it gets a new, separate group.
> 
> This all seems to be as the built-in module init ordering is as follows: 
> pcieport drv, smmu drv, mlx5 drv
> 
> I notice that if I build the mlx5 drv as a ko and insert after boot, all 
> functions + pcieport are in the same group:
> 
> [   11.530046] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
> [   17.301093] hns_dsaf HISI00B2:00: Adding to iommu group 1
> [   18.743600] ehci-platform PNP0D20:00: Adding to iommu group 2
> [   20.212284] pcieport 0002:f8:00.0: Adding to iommu group 3
> [   20.356303] pcieport 0004:88:00.0: Adding to iommu group 4
> [   20.493337] pcieport 0005:78:00.0: Adding to iommu group 5
> [   20.702999] pcieport 000a:10:00.0: Adding to iommu group 6
> [   20.859183] pcieport 000c:20:00.0: Adding to iommu group 7
> [   20.996140] pcieport 000d:30:00.0: Adding to iommu group 8
> [   21.152637] serial 0002:f9:00.0: Adding to iommu group 3
> [   21.346991] serial 0002:f9:00.1: Adding to iommu group 3
> [  100.754306] mlx5_core 000a:11:00.0: Adding to iommu group 6
> [  101.420156] mlx5_core 000a:11:00.1: Adding to iommu group 6
> [  292.481714] mlx5_core 000a:11:00.2: Adding to iommu group 6
> [  293.281061] mlx5_core 000a:11:00.3: Adding to iommu group 6
> 
> This does seem like a problem for arm64 platforms which don't support 
> ACS, yet enable an SMMU. Maybe also a problem even if they do support ACS.
> 
> Opinion?

Yeah, this is less than ideal. One way to bodge it might be to make 
pci_device_group() also walk downwards to see if any non-ACS-isolated 
children already have a group, rather than assuming that groups get 
allocated in hierarchical order, but that's far from ideal.

The underlying issue is that, for historical reasons, OF/IORT-based 
IOMMU drivers have ended up with group allocation being tied to endpoint 
driver probing via the dma_configure() mechanism (long story short, 
driver probe is the only thing which can be delayed in order to wait for 
a specific IOMMU instance to be ready). However, in the meantime, the 
IOMMU API internals have evolved sufficiently that I think there's a way 
to really put things right - I have the spark of an idea which I'll try 
to sketch out ASAP...

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: arm64 iommu groups issue
  2019-09-19 13:25 ` Robin Murphy
@ 2019-09-19 14:35   ` John Garry
  2019-11-04 12:18     ` John Garry
  0 siblings, 1 reply; 4+ messages in thread
From: John Garry @ 2019-09-19 14:35 UTC (permalink / raw)
  To: Robin Murphy, Marc Zyngier, Will Deacon, Lorenzo Pieralisi,
	Sudeep Holla, Guohanjun (Hanjun Guo)
  Cc: linux-kernel, Alex Williamson, Linuxarm, iommu, Bjorn Helgaas,
	linux-arm-kernel

On 19/09/2019 14:25, Robin Murphy wrote:
>> When the port eventually probes it gets a new, separate group.
>>
>> This all seems to be as the built-in module init ordering is as
>> follows: pcieport drv, smmu drv, mlx5 drv
>>
>> I notice that if I build the mlx5 drv as a ko and insert after boot,
>> all functions + pcieport are in the same group:
>>
>> [   11.530046] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
>> [   17.301093] hns_dsaf HISI00B2:00: Adding to iommu group 1
>> [   18.743600] ehci-platform PNP0D20:00: Adding to iommu group 2
>> [   20.212284] pcieport 0002:f8:00.0: Adding to iommu group 3
>> [   20.356303] pcieport 0004:88:00.0: Adding to iommu group 4
>> [   20.493337] pcieport 0005:78:00.0: Adding to iommu group 5
>> [   20.702999] pcieport 000a:10:00.0: Adding to iommu group 6
>> [   20.859183] pcieport 000c:20:00.0: Adding to iommu group 7
>> [   20.996140] pcieport 000d:30:00.0: Adding to iommu group 8
>> [   21.152637] serial 0002:f9:00.0: Adding to iommu group 3
>> [   21.346991] serial 0002:f9:00.1: Adding to iommu group 3
>> [  100.754306] mlx5_core 000a:11:00.0: Adding to iommu group 6
>> [  101.420156] mlx5_core 000a:11:00.1: Adding to iommu group 6
>> [  292.481714] mlx5_core 000a:11:00.2: Adding to iommu group 6
>> [  293.281061] mlx5_core 000a:11:00.3: Adding to iommu group 6
>>
>> This does seem like a problem for arm64 platforms which don't support
>> ACS, yet enable an SMMU. Maybe also a problem even if they do support
>> ACS.
>>
>> Opinion?
>

Hi Robin,

> Yeah, this is less than ideal.

For sure. Our production D05 boards don't ship with the SMMU enabled in 
BIOS, but it would be slightly concerning in this regard if they did.

 > One way to bodge it might be to make
> pci_device_group() also walk downwards to see if any non-ACS-isolated
> children already have a group, rather than assuming that groups get
> allocated in hierarchical order, but that's far from ideal.

Agree.

My own workaround was to hack the mentioned iort code to defer the PF 
probe if the parent port had also yet to probe.

>
> The underlying issue is that, for historical reasons, OF/IORT-based
> IOMMU drivers have ended up with group allocation being tied to endpoint
> driver probing via the dma_configure() mechanism (long story short,
> driver probe is the only thing which can be delayed in order to wait for
> a specific IOMMU instance to be ready).However, in the meantime, the
> IOMMU API internals have evolved sufficiently that I think there's a way
> to really put things right - I have the spark of an idea which I'll try
> to sketch out ASAP...
>

OK, great.

Thanks,
John

> Robin.


_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: arm64 iommu groups issue
  2019-09-19 14:35   ` John Garry
@ 2019-11-04 12:18     ` John Garry
  0 siblings, 0 replies; 4+ messages in thread
From: John Garry @ 2019-11-04 12:18 UTC (permalink / raw)
  To: Robin Murphy, Marc Zyngier, Will Deacon, Lorenzo Pieralisi,
	Sudeep Holla, Guohanjun (Hanjun Guo)
  Cc: Saravana Kannan, linux-kernel, Alex Williamson, Linuxarm, iommu,
	Bjorn Helgaas, linux-arm-kernel

+

On 19/09/2019 15:35, John Garry wrote:
> On 19/09/2019 14:25, Robin Murphy wrote:
>>> When the port eventually probes it gets a new, separate group.
>>>
>>> This all seems to be as the built-in module init ordering is as
>>> follows: pcieport drv, smmu drv, mlx5 drv
>>>
>>> I notice that if I build the mlx5 drv as a ko and insert after boot,
>>> all functions + pcieport are in the same group:
>>>
>>> [   11.530046] hisi_sas_v2_hw HISI0162:01: Adding to iommu group 0
>>> [   17.301093] hns_dsaf HISI00B2:00: Adding to iommu group 1
>>> [   18.743600] ehci-platform PNP0D20:00: Adding to iommu group 2
>>> [   20.212284] pcieport 0002:f8:00.0: Adding to iommu group 3
>>> [   20.356303] pcieport 0004:88:00.0: Adding to iommu group 4
>>> [   20.493337] pcieport 0005:78:00.0: Adding to iommu group 5
>>> [   20.702999] pcieport 000a:10:00.0: Adding to iommu group 6
>>> [   20.859183] pcieport 000c:20:00.0: Adding to iommu group 7
>>> [   20.996140] pcieport 000d:30:00.0: Adding to iommu group 8
>>> [   21.152637] serial 0002:f9:00.0: Adding to iommu group 3
>>> [   21.346991] serial 0002:f9:00.1: Adding to iommu group 3
>>> [  100.754306] mlx5_core 000a:11:00.0: Adding to iommu group 6
>>> [  101.420156] mlx5_core 000a:11:00.1: Adding to iommu group 6
>>> [  292.481714] mlx5_core 000a:11:00.2: Adding to iommu group 6
>>> [  293.281061] mlx5_core 000a:11:00.3: Adding to iommu group 6
>>>
>>> This does seem like a problem for arm64 platforms which don't support
>>> ACS, yet enable an SMMU. Maybe also a problem even if they do support
>>> ACS.
>>>
>>> Opinion?
>>
> 
> Hi Robin,
> 
>> Yeah, this is less than ideal.
> 
> For sure. Our production D05 boards don't ship with the SMMU enabled in 
> BIOS, but it would be slightly concerning in this regard if they did.
> 
>  > One way to bodge it might be to make
>> pci_device_group() also walk downwards to see if any non-ACS-isolated
>> children already have a group, rather than assuming that groups get
>> allocated in hierarchical order, but that's far from ideal.
> 
> Agree.
> 
> My own workaround was to hack the mentioned iort code to defer the PF 
> probe if the parent port had also yet to probe.
> 
>>
>> The underlying issue is that, for historical reasons, OF/IORT-based
>> IOMMU drivers have ended up with group allocation being tied to endpoint
>> driver probing via the dma_configure() mechanism (long story short,
>> driver probe is the only thing which can be delayed in order to wait for
>> a specific IOMMU instance to be ready).However, in the meantime, the
>> IOMMU API internals have evolved sufficiently that I think there's a way
>> to really put things right - I have the spark of an idea which I'll try
>> to sketch out ASAP...
>>
> 
> OK, great.
> 
> Thanks,
> John
> 
>> Robin.
> 

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, back to index

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-19  8:43 arm64 iommu groups issue John Garry
2019-09-19 13:25 ` Robin Murphy
2019-09-19 14:35   ` John Garry
2019-11-04 12:18     ` John Garry

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git