* [PATCH 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation
@ 2021-06-10 4:16 Ashish Mhetre
2021-06-10 4:16 ` [PATCH 1/2] iommu: Fix race condition during default domain allocation Ashish Mhetre
2021-06-10 4:16 ` [PATCH 2/2] iommu/arm-smmu: Fix race condition during iommu_group creation Ashish Mhetre
0 siblings, 2 replies; 8+ messages in thread
From: Ashish Mhetre @ 2021-06-10 4:16 UTC (permalink / raw)
To: amhetre, joro, will, robin.murphy, vdumpa
Cc: iommu, linux-kernel, linux-arm-kernel
Fix races during iommu group/domain creation for devices sharing same SID.
Ashish Mhetre (1):
iommu: Fix race condition during default domain allocation
Krishna Reddy (1):
iommu/arm-smmu: Fix race condition during iommu_group creation
drivers/iommu/arm/arm-smmu/arm-smmu.c | 6 +++++-
drivers/iommu/iommu.c | 2 ++
2 files changed, 7 insertions(+), 1 deletion(-)
--
2.7.4
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-10 4:16 [PATCH 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation Ashish Mhetre
@ 2021-06-10 4:16 ` Ashish Mhetre
2021-06-11 10:45 ` Will Deacon
2021-06-10 4:16 ` [PATCH 2/2] iommu/arm-smmu: Fix race condition during iommu_group creation Ashish Mhetre
1 sibling, 1 reply; 8+ messages in thread
From: Ashish Mhetre @ 2021-06-10 4:16 UTC (permalink / raw)
To: amhetre, joro, will, robin.murphy, vdumpa
Cc: iommu, linux-kernel, linux-arm-kernel
Domain is getting created more than once during asynchronous multiple
display heads(devices) probe. All the display heads share same SID and
are expected to be in same domain. As iommu_alloc_default_domain() call
is not protected, the group->default_domain and group->domain are ending
up with different domains and leading to subsequent IOMMU faults.
Fix this by protecting iommu_alloc_default_domain() call with group->mutex.
Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
---
drivers/iommu/iommu.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70..2700500 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev)
* support default domains, so the return value is not yet
* checked.
*/
+ mutex_lock(&group->mutex);
iommu_alloc_default_domain(group, dev);
+ mutex_unlock(&group->mutex);
if (group->default_domain) {
ret = __iommu_attach_device(group->default_domain, dev);
--
2.7.4
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/2] iommu/arm-smmu: Fix race condition during iommu_group creation
2021-06-10 4:16 [PATCH 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation Ashish Mhetre
2021-06-10 4:16 ` [PATCH 1/2] iommu: Fix race condition during default domain allocation Ashish Mhetre
@ 2021-06-10 4:16 ` Ashish Mhetre
1 sibling, 0 replies; 8+ messages in thread
From: Ashish Mhetre @ 2021-06-10 4:16 UTC (permalink / raw)
To: amhetre, joro, will, robin.murphy, vdumpa
Cc: iommu, linux-kernel, linux-arm-kernel
From: Krishna Reddy <vdumpa@nvidia.com>
iommu_group is getting created more than once during asynchronous multiple
display heads(devices) probe on Tegra194 SoC. All the display heads share
same SID and are expected to be in same iommu_group.
As arm_smmu_device_group() is not protecting group creation across devices,
it is leading to multiple groups creation across devices with same SID and
subsequent IOMMU faults.
Fix this by protecting group creation with smmu->stream_map_mutex.
Signed-off-by: Krishna Reddy <vdumpa@nvidia.com>
---
drivers/iommu/arm/arm-smmu/arm-smmu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 6f72c4d..21af179 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1458,6 +1458,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
struct iommu_group *group = NULL;
int i, idx;
+ mutex_lock(&smmu->stream_map_mutex);
for_each_cfg_sme(cfg, fwspec, i, idx) {
if (group && smmu->s2crs[idx].group &&
group != smmu->s2crs[idx].group)
@@ -1466,8 +1467,10 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
group = smmu->s2crs[idx].group;
}
- if (group)
+ if (group) {
+ mutex_unlock(&smmu->stream_map_mutex);
return iommu_group_ref_get(group);
+ }
if (dev_is_pci(dev))
group = pci_device_group(dev);
@@ -1481,6 +1484,7 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
for_each_cfg_sme(cfg, fwspec, i, idx)
smmu->s2crs[idx].group = group;
+ mutex_unlock(&smmu->stream_map_mutex);
return group;
}
--
2.7.4
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-10 4:16 ` [PATCH 1/2] iommu: Fix race condition during default domain allocation Ashish Mhetre
@ 2021-06-11 10:45 ` Will Deacon
2021-06-11 12:49 ` Robin Murphy
2021-06-11 18:30 ` Krishna Reddy
0 siblings, 2 replies; 8+ messages in thread
From: Will Deacon @ 2021-06-11 10:45 UTC (permalink / raw)
To: Ashish Mhetre; +Cc: joro, iommu, linux-kernel, robin.murphy, linux-arm-kernel
On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote:
> Domain is getting created more than once during asynchronous multiple
> display heads(devices) probe. All the display heads share same SID and
> are expected to be in same domain. As iommu_alloc_default_domain() call
> is not protected, the group->default_domain and group->domain are ending
> up with different domains and leading to subsequent IOMMU faults.
> Fix this by protecting iommu_alloc_default_domain() call with group->mutex.
Can you provide some more information about exactly what the h/w
configuration is, and the callstack which exhibits the race, please?
> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
> ---
> drivers/iommu/iommu.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 808ab70..2700500 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev)
> * support default domains, so the return value is not yet
> * checked.
> */
> + mutex_lock(&group->mutex);
> iommu_alloc_default_domain(group, dev);
> + mutex_unlock(&group->mutex);
It feels wrong to serialise this for everybody just to cater for systems
with aliasing SIDs between devices.
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-11 10:45 ` Will Deacon
@ 2021-06-11 12:49 ` Robin Murphy
2021-06-17 5:51 ` Ashish Mhetre
2021-06-11 18:30 ` Krishna Reddy
1 sibling, 1 reply; 8+ messages in thread
From: Robin Murphy @ 2021-06-11 12:49 UTC (permalink / raw)
To: Will Deacon, Ashish Mhetre; +Cc: iommu, linux-kernel, linux-arm-kernel
On 2021-06-11 11:45, Will Deacon wrote:
> On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote:
>> Domain is getting created more than once during asynchronous multiple
>> display heads(devices) probe. All the display heads share same SID and
>> are expected to be in same domain. As iommu_alloc_default_domain() call
>> is not protected, the group->default_domain and group->domain are ending
>> up with different domains and leading to subsequent IOMMU faults.
>> Fix this by protecting iommu_alloc_default_domain() call with group->mutex.
>
> Can you provide some more information about exactly what the h/w
> configuration is, and the callstack which exhibits the race, please?
It'll be basically the same as the issue reported long ago with PCI
groups in the absence of ACS not being constructed correctly. Triggering
the iommu_probe_device() replay in of_iommu_configure() off the back of
driver probe is way too late and allows calls to happen in the wrong
order, or indeed race in parallel as here. Fixing that is still on my
radar, but will not be simple, and will probably go hand-in-hand with
phasing out the bus ops (for the multiple-driver-coexistence problem).
>> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
>> ---
>> drivers/iommu/iommu.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 808ab70..2700500 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev)
>> * support default domains, so the return value is not yet
>> * checked.
>> */
>> + mutex_lock(&group->mutex);
>> iommu_alloc_default_domain(group, dev);
>> + mutex_unlock(&group->mutex);
>
> It feels wrong to serialise this for everybody just to cater for systems
> with aliasing SIDs between devices.
If two or more devices are racing at this point then they're already
going to be serialised by at least iommu_group_add_device(), so I doubt
there would be much impact - only the first device through here will
hold the mutex for any appreciable length of time. Every other path
which modifies group->domain does so with the mutex held (note the
"expected" default domain allocation flow in bus_iommu_probe() in
particular), so not holding it here does seem like a straightforward
oversight.
Robin.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* RE: [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-11 10:45 ` Will Deacon
2021-06-11 12:49 ` Robin Murphy
@ 2021-06-11 18:30 ` Krishna Reddy
1 sibling, 0 replies; 8+ messages in thread
From: Krishna Reddy @ 2021-06-11 18:30 UTC (permalink / raw)
To: Will Deacon, Ashish Mhetre
Cc: joro, robin.murphy, iommu, linux-kernel, linux-arm-kernel
> > + mutex_lock(&group->mutex);
> > iommu_alloc_default_domain(group, dev);
> > + mutex_unlock(&group->mutex);
>
> It feels wrong to serialise this for everybody just to cater for systems with
> aliasing SIDs between devices.
Serialization is limited to devices in the same group. Unless devices share SID, they wouldn't be in same group.
> Can you provide some more information about exactly what the h/w
> configuration is, and the callstack which exhibits the race, please?
The failure is an after effect and is a page fault. Don't have a failure call stack here. Ashish has traced it through print messages and he can provide them.
From the prints messages, The following was observed in page fault case:
Device1: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain)
Device2: iommu_probe_device() --> iommu_alloc_default_domain() --> iommu_group_alloc_default_domain() --> __iommu_attach_device(group->default_domain)
Both devices(with same SID) are entering into iommu_group_alloc_default_domain() function and each one getting attached to a different group->default_domain
as the second one overwrites group->default_domain after the first one attaches to group->default_domain it has created.
SMMU would be setup to use first domain for the context page table. Whereas all the dma map/unamp requests from second device would
be performed on a domain that is not used by SMMU for context translations and IOVA (not mapped in first domain) accesses from second device lead to page faults.
-KR
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-11 12:49 ` Robin Murphy
@ 2021-06-17 5:51 ` Ashish Mhetre
2021-06-17 17:49 ` Will Deacon
0 siblings, 1 reply; 8+ messages in thread
From: Ashish Mhetre @ 2021-06-17 5:51 UTC (permalink / raw)
To: Robin Murphy, Will Deacon, vdumpa; +Cc: iommu, linux-kernel, linux-arm-kernel
On 6/11/2021 6:19 PM, Robin Murphy wrote:
> External email: Use caution opening links or attachments
>
>
> On 2021-06-11 11:45, Will Deacon wrote:
>> On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote:
>>> Domain is getting created more than once during asynchronous multiple
>>> display heads(devices) probe. All the display heads share same SID and
>>> are expected to be in same domain. As iommu_alloc_default_domain() call
>>> is not protected, the group->default_domain and group->domain are ending
>>> up with different domains and leading to subsequent IOMMU faults.
>>> Fix this by protecting iommu_alloc_default_domain() call with
>>> group->mutex.
>>
>> Can you provide some more information about exactly what the h/w
>> configuration is, and the callstack which exhibits the race, please?
>
> It'll be basically the same as the issue reported long ago with PCI
> groups in the absence of ACS not being constructed correctly. Triggering
> the iommu_probe_device() replay in of_iommu_configure() off the back of
> driver probe is way too late and allows calls to happen in the wrong
> order, or indeed race in parallel as here. Fixing that is still on my
> radar, but will not be simple, and will probably go hand-in-hand with
> phasing out the bus ops (for the multiple-driver-coexistence problem).
>
For iommu group creation, the stack flow during race is like:
Display device 1:
iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group
Display device 2:
iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group
And this way it ends up in creating 2 groups for 2 display devices
sharing same SID.
Ideally for 2nd display device, iommu_group_get call from
iommu_group_get_for_dev should return same group as 1st display device.
But due to the race, it ends up with 2 groups.
For default domain, the stack flow during race is like:
Display device 1:
iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc
Display device 2:
iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc
Here also 2nd device should already have domain allocated and
'if(group->default_domain)' condition from iommu_alloc_default_domain
should be true for 2nd device.
Issue with this is IOVA accesses from 2nd device results in context faults.
>>> Signed-off-by: Ashish Mhetre <amhetre@nvidia.com>
>>> ---
>>> drivers/iommu/iommu.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>>> index 808ab70..2700500 100644
>>> --- a/drivers/iommu/iommu.c
>>> +++ b/drivers/iommu/iommu.c
>>> @@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev)
>>> * support default domains, so the return value is not yet
>>> * checked.
>>> */
>>> + mutex_lock(&group->mutex);
>>> iommu_alloc_default_domain(group, dev);
>>> + mutex_unlock(&group->mutex);
>>
>> It feels wrong to serialise this for everybody just to cater for systems
>> with aliasing SIDs between devices.
>
> If two or more devices are racing at this point then they're already
> going to be serialised by at least iommu_group_add_device(), so I doubt
> there would be much impact - only the first device through here will
> hold the mutex for any appreciable length of time. Every other path
> which modifies group->domain does so with the mutex held (note the
> "expected" default domain allocation flow in bus_iommu_probe() in
> particular), so not holding it here does seem like a straightforward
> oversight.
>
> Robin.
Serialization will only happen for the devices sharing same group. Only
the first device in group will hold this till domain is created. For
rest of the devices it will just check for existing domain in
iommu_alloc_default_domain and then return and release the mutex.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation
2021-06-17 5:51 ` Ashish Mhetre
@ 2021-06-17 17:49 ` Will Deacon
0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2021-06-17 17:49 UTC (permalink / raw)
To: Ashish Mhetre; +Cc: Robin Murphy, iommu, linux-arm-kernel, linux-kernel
On Thu, Jun 17, 2021 at 11:21:39AM +0530, Ashish Mhetre wrote:
>
>
> On 6/11/2021 6:19 PM, Robin Murphy wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On 2021-06-11 11:45, Will Deacon wrote:
> > > On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote:
> > > > Domain is getting created more than once during asynchronous multiple
> > > > display heads(devices) probe. All the display heads share same SID and
> > > > are expected to be in same domain. As iommu_alloc_default_domain() call
> > > > is not protected, the group->default_domain and group->domain are ending
> > > > up with different domains and leading to subsequent IOMMU faults.
> > > > Fix this by protecting iommu_alloc_default_domain() call with
> > > > group->mutex.
> > >
> > > Can you provide some more information about exactly what the h/w
> > > configuration is, and the callstack which exhibits the race, please?
> >
> > It'll be basically the same as the issue reported long ago with PCI
> > groups in the absence of ACS not being constructed correctly. Triggering
> > the iommu_probe_device() replay in of_iommu_configure() off the back of
> > driver probe is way too late and allows calls to happen in the wrong
> > order, or indeed race in parallel as here. Fixing that is still on my
> > radar, but will not be simple, and will probably go hand-in-hand with
> > phasing out the bus ops (for the multiple-driver-coexistence problem).
> >
> For iommu group creation, the stack flow during race is like:
> Display device 1:
> iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group
> Display device 2:
> iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group
>
> And this way it ends up in creating 2 groups for 2 display devices sharing
> same SID.
> Ideally for 2nd display device, iommu_group_get call from
> iommu_group_get_for_dev should return same group as 1st display device. But
> due to the race, it ends up with 2 groups.
>
> For default domain, the stack flow during race is like:
> Display device 1:
> iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc
> Display device 2:
> iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc
>
> Here also 2nd device should already have domain allocated and
> 'if(group->default_domain)' condition from iommu_alloc_default_domain should
> be true for 2nd device.
>
> Issue with this is IOVA accesses from 2nd device results in context faults.
Thanks for the explanation (also Robin and Krishna). Please put some of this
in the commit message for the next version.
Will
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-06-17 17:50 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-10 4:16 [PATCH 0/2] iommu/arm-smmu: Fix races in iommu domain/group creation Ashish Mhetre
2021-06-10 4:16 ` [PATCH 1/2] iommu: Fix race condition during default domain allocation Ashish Mhetre
2021-06-11 10:45 ` Will Deacon
2021-06-11 12:49 ` Robin Murphy
2021-06-17 5:51 ` Ashish Mhetre
2021-06-17 17:49 ` Will Deacon
2021-06-11 18:30 ` Krishna Reddy
2021-06-10 4:16 ` [PATCH 2/2] iommu/arm-smmu: Fix race condition during iommu_group creation Ashish Mhetre
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).