* [PATCH 0/4] Miscellaneous ARM SMMU patches @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 Hi Will, Here's my current "miscellaneous SMMU enhancements" branch for your consideration. Patch #1 solves a subtle corner case that cropped up while exercising stage 1 context formats on the DMA330-MMU500 model; #3 will be wanted fairly soon for DMA ops integration so may as well get some exposure now; #2 and #4 are more nice-to-haves. Thanks, Robin. Robin Murphy (4): iommu/arm-smmu: Treat all device transactions as unprivileged iommu/arm-smmu: Allow disabling unmatched stream bypass iommu/arm-smmu: Support DMA-API domains iommu/arm-smmu: Use per-context TLB sync as appropriate drivers/iommu/arm-smmu-v3.c | 10 +++++- drivers/iommu/arm-smmu.c | 79 ++++++++++++++++++++++++++++++++++----------- 2 files changed, 69 insertions(+), 20 deletions(-) -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/4] Miscellaneous ARM SMMU patches @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: linux-arm-kernel Hi Will, Here's my current "miscellaneous SMMU enhancements" branch for your consideration. Patch #1 solves a subtle corner case that cropped up while exercising stage 1 context formats on the DMA330-MMU500 model; #3 will be wanted fairly soon for DMA ops integration so may as well get some exposure now; #2 and #4 are more nice-to-haves. Thanks, Robin. Robin Murphy (4): iommu/arm-smmu: Treat all device transactions as unprivileged iommu/arm-smmu: Allow disabling unmatched stream bypass iommu/arm-smmu: Support DMA-API domains iommu/arm-smmu: Use per-context TLB sync as appropriate drivers/iommu/arm-smmu-v3.c | 10 +++++- drivers/iommu/arm-smmu.c | 79 ++++++++++++++++++++++++++++++++++----------- 2 files changed, 69 insertions(+), 20 deletions(-) -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <cover.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged 2016-01-26 18:06 ` Robin Murphy @ 2016-01-26 18:06 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 The IOMMU API has no concept of privilege so assumes all devices and mappings are equal, and indeed most non-CPU master devices on an AMBA interconnect make little use of the attribute bits on the bus thus by default perform unprivileged data accesses. Some devices, however, believe themselves more equal than others, such as programmable DMA controllers whose 'master' thread issues bus transactions marked as privileged instruction fetches, while the data accesses of its channel threads (under the control of Linux, at least) are marked as unprivileged. This poses a problem for implementing the DMA API on an IOMMU conforming to ARM VMSAv8, under which a page that is unprivileged-writeable is also implicitly privileged-execute-never. Given that, there is no one set of attributes with which iommu_map() can implement, say, dma_alloc_coherent() that will allow every possible type of access without something running into unexecepted permission faults. Fortunately the SMMU architecture provides a means to mitigate such issues by overriding the incoming attributes of a transaction; make use of that to strip the privileged/unprivileged status off incoming transactions, leaving just the instruction/data dichotomy which the IOMMU API does at least understand; Four states good, two states better. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- drivers/iommu/arm-smmu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 59ee4b8..1f9093d 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -167,6 +167,9 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) +#define S2CR_PRIVCFG_SHIFT 24 +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) + /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT 0 @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: linux-arm-kernel The IOMMU API has no concept of privilege so assumes all devices and mappings are equal, and indeed most non-CPU master devices on an AMBA interconnect make little use of the attribute bits on the bus thus by default perform unprivileged data accesses. Some devices, however, believe themselves more equal than others, such as programmable DMA controllers whose 'master' thread issues bus transactions marked as privileged instruction fetches, while the data accesses of its channel threads (under the control of Linux, at least) are marked as unprivileged. This poses a problem for implementing the DMA API on an IOMMU conforming to ARM VMSAv8, under which a page that is unprivileged-writeable is also implicitly privileged-execute-never. Given that, there is no one set of attributes with which iommu_map() can implement, say, dma_alloc_coherent() that will allow every possible type of access without something running into unexecepted permission faults. Fortunately the SMMU architecture provides a means to mitigate such issues by overriding the incoming attributes of a transaction; make use of that to strip the privileged/unprivileged status off incoming transactions, leaving just the instruction/data dichotomy which the IOMMU API does at least understand; Four states good, two states better. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/arm-smmu.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 59ee4b8..1f9093d 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -167,6 +167,9 @@ #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) +#define S2CR_PRIVCFG_SHIFT 24 +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) + /* Context bank attribute registers */ #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) #define CBAR_VMID_SHIFT 0 @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, u32 idx, s2cr; idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; - s2cr = S2CR_TYPE_TRANS | + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
[parent not found: <6c5730256333b8d941f2c0371c1ab709a454938c.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* RE: [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged 2016-01-26 18:06 ` Robin Murphy @ 2016-01-27 6:00 ` Anup Patel -1 siblings, 0 replies; 32+ messages in thread From: Anup Patel @ 2016-01-27 6:00 UTC (permalink / raw) To: Robin Murphy, will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, bcm-kernel-feedback-list, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 Hi Robin, > -----Original Message----- > From: iommu-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org [mailto:iommu- > bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org] On Behalf Of Robin Murphy > Sent: 26 January 2016 23:37 > To: will.deacon-5wv7dgnIgG8@public.gmane.org > Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org; linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org; > tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8@public.gmane.org > Subject: [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as > unprivileged > > The IOMMU API has no concept of privilege so assumes all devices and > mappings are equal, and indeed most non-CPU master devices on an AMBA > interconnect make little use of the attribute bits on the bus thus by default > perform unprivileged data accesses. > > Some devices, however, believe themselves more equal than others, such as > programmable DMA controllers whose 'master' thread issues bus transactions > marked as privileged instruction fetches, while the data accesses of its channel > threads (under the control of Linux, at least) are marked as unprivileged. This > poses a problem for implementing the DMA API on an IOMMU conforming to > ARM VMSAv8, under which a page that is unprivileged-writeable is also implicitly > privileged-execute-never. > Given that, there is no one set of attributes with which iommu_map() can > implement, say, dma_alloc_coherent() that will allow every possible type of > access without something running into unexecepted permission faults. > > Fortunately the SMMU architecture provides a means to mitigate such issues by > overriding the incoming attributes of a transaction; make use of that to strip the > privileged/unprivileged status off incoming transactions, leaving just the > instruction/data dichotomy which the IOMMU API does at least understand; > Four states good, two states better. > > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> > --- > drivers/iommu/arm-smmu.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index > 59ee4b8..1f9093d 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -167,6 +167,9 @@ > #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) > #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) > > +#define S2CR_PRIVCFG_SHIFT 24 > +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) > + > /* Context bank attribute registers */ > #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) > #define CBAR_VMID_SHIFT 0 > @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct > arm_smmu_domain *smmu_domain, > u32 idx, s2cr; > > idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; > - s2cr = S2CR_TYPE_TRANS | > + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | > (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); Treating all MMU master access as unprivileged sounds more conservative. Alternate approach would be to treat instruction fetches as data reads. Regards, Anup ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged @ 2016-01-27 6:00 ` Anup Patel 0 siblings, 0 replies; 32+ messages in thread From: Anup Patel @ 2016-01-27 6:00 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, > -----Original Message----- > From: iommu-bounces at lists.linux-foundation.org [mailto:iommu- > bounces at lists.linux-foundation.org] On Behalf Of Robin Murphy > Sent: 26 January 2016 23:37 > To: will.deacon at arm.com > Cc: iommu at lists.linux-foundation.org; linux-arm-kernel at lists.infradead.org; > tchalamarla at caviumnetworks.com > Subject: [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as > unprivileged > > The IOMMU API has no concept of privilege so assumes all devices and > mappings are equal, and indeed most non-CPU master devices on an AMBA > interconnect make little use of the attribute bits on the bus thus by default > perform unprivileged data accesses. > > Some devices, however, believe themselves more equal than others, such as > programmable DMA controllers whose 'master' thread issues bus transactions > marked as privileged instruction fetches, while the data accesses of its channel > threads (under the control of Linux, at least) are marked as unprivileged. This > poses a problem for implementing the DMA API on an IOMMU conforming to > ARM VMSAv8, under which a page that is unprivileged-writeable is also implicitly > privileged-execute-never. > Given that, there is no one set of attributes with which iommu_map() can > implement, say, dma_alloc_coherent() that will allow every possible type of > access without something running into unexecepted permission faults. > > Fortunately the SMMU architecture provides a means to mitigate such issues by > overriding the incoming attributes of a transaction; make use of that to strip the > privileged/unprivileged status off incoming transactions, leaving just the > instruction/data dichotomy which the IOMMU API does at least understand; > Four states good, two states better. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/arm-smmu.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index > 59ee4b8..1f9093d 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -167,6 +167,9 @@ > #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) > #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) > > +#define S2CR_PRIVCFG_SHIFT 24 > +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) > + > /* Context bank attribute registers */ > #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) > #define CBAR_VMID_SHIFT 0 > @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct > arm_smmu_domain *smmu_domain, > u32 idx, s2cr; > > idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; > - s2cr = S2CR_TYPE_TRANS | > + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | > (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); Treating all MMU master access as unprivileged sounds more conservative. Alternate approach would be to treat instruction fetches as data reads. Regards, Anup ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged 2016-01-26 18:06 ` Robin Murphy @ 2016-02-09 14:08 ` Will Deacon -1 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:08 UTC (permalink / raw) To: Robin Murphy Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 On Tue, Jan 26, 2016 at 06:06:34PM +0000, Robin Murphy wrote: > The IOMMU API has no concept of privilege so assumes all devices and > mappings are equal, and indeed most non-CPU master devices on an AMBA > interconnect make little use of the attribute bits on the bus thus by > default perform unprivileged data accesses. > > Some devices, however, believe themselves more equal than others, such > as programmable DMA controllers whose 'master' thread issues bus > transactions marked as privileged instruction fetches, while the data > accesses of its channel threads (under the control of Linux, at least) > are marked as unprivileged. This poses a problem for implementing the > DMA API on an IOMMU conforming to ARM VMSAv8, under which a page that is > unprivileged-writeable is also implicitly privileged-execute-never. > Given that, there is no one set of attributes with which iommu_map() can > implement, say, dma_alloc_coherent() that will allow every possible type > of access without something running into unexecepted permission faults. > > Fortunately the SMMU architecture provides a means to mitigate such > issues by overriding the incoming attributes of a transaction; make use > of that to strip the privileged/unprivileged status off incoming > transactions, leaving just the instruction/data dichotomy which the > IOMMU API does at least understand; Four states good, two states better. > > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> > --- > drivers/iommu/arm-smmu.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 59ee4b8..1f9093d 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -167,6 +167,9 @@ > #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) > #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) > > +#define S2CR_PRIVCFG_SHIFT 24 > +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) > + > /* Context bank attribute registers */ > #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) > #define CBAR_VMID_SHIFT 0 > @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, > u32 idx, s2cr; > > idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; > - s2cr = S2CR_TYPE_TRANS | > + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | > (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); > writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); Hmm, do we also need to worry about the bypass case? I guess not for the moment, but I anticipate horrible things. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged @ 2016-02-09 14:08 ` Will Deacon 0 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:08 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jan 26, 2016 at 06:06:34PM +0000, Robin Murphy wrote: > The IOMMU API has no concept of privilege so assumes all devices and > mappings are equal, and indeed most non-CPU master devices on an AMBA > interconnect make little use of the attribute bits on the bus thus by > default perform unprivileged data accesses. > > Some devices, however, believe themselves more equal than others, such > as programmable DMA controllers whose 'master' thread issues bus > transactions marked as privileged instruction fetches, while the data > accesses of its channel threads (under the control of Linux, at least) > are marked as unprivileged. This poses a problem for implementing the > DMA API on an IOMMU conforming to ARM VMSAv8, under which a page that is > unprivileged-writeable is also implicitly privileged-execute-never. > Given that, there is no one set of attributes with which iommu_map() can > implement, say, dma_alloc_coherent() that will allow every possible type > of access without something running into unexecepted permission faults. > > Fortunately the SMMU architecture provides a means to mitigate such > issues by overriding the incoming attributes of a transaction; make use > of that to strip the privileged/unprivileged status off incoming > transactions, leaving just the instruction/data dichotomy which the > IOMMU API does at least understand; Four states good, two states better. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/arm-smmu.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 59ee4b8..1f9093d 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -167,6 +167,9 @@ > #define S2CR_TYPE_BYPASS (1 << S2CR_TYPE_SHIFT) > #define S2CR_TYPE_FAULT (2 << S2CR_TYPE_SHIFT) > > +#define S2CR_PRIVCFG_SHIFT 24 > +#define S2CR_PRIVCFG_UNPRIV (2 << S2CR_PRIVCFG_SHIFT) > + > /* Context bank attribute registers */ > #define ARM_SMMU_GR1_CBAR(n) (0x0 + ((n) << 2)) > #define CBAR_VMID_SHIFT 0 > @@ -1083,7 +1086,7 @@ static int arm_smmu_domain_add_master(struct arm_smmu_domain *smmu_domain, > u32 idx, s2cr; > > idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; > - s2cr = S2CR_TYPE_TRANS | > + s2cr = S2CR_TYPE_TRANS | S2CR_PRIVCFG_UNPRIV | > (smmu_domain->cfg.cbndx << S2CR_CBNDX_SHIFT); > writel_relaxed(s2cr, gr0_base + ARM_SMMU_GR0_S2CR(idx)); Hmm, do we also need to worry about the bypass case? I guess not for the moment, but I anticipate horrible things. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass 2016-01-26 18:06 ` Robin Murphy @ 2016-01-26 18:06 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 Borrow the disable_bypass parameter from the SMMUv3 driver as a handy debugging/security feature so that unmatched stream IDs (i.e. devices not attached to an IOMMU domain) may be configured to fault. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 1f9093d..d1b7dc1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -141,6 +141,8 @@ #define ID2_PTFS_16K (1 << 13) #define ID2_PTFS_64K (1 << 14) +#define sGFSR_USF (1 << 1) + /* Global TLB invalidation */ #define ARM_SMMU_GR0_TLBIVMID 0x64 #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 @@ -263,6 +265,10 @@ static int force_stage; module_param_named(force_stage, force_stage, int, S_IRUGO); MODULE_PARM_DESC(force_stage, "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); +static bool disable_bypass; +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); +MODULE_PARM_DESC(disable_bypass, + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); enum arm_smmu_arch_version { ARM_SMMU_V1 = 1, @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) if (!gfsr) return IRQ_NONE; - dev_err_ratelimited(smmu->dev, - "Unexpected global fault, this could be serious\n"); + if (gfsr & sGFSR_USF) + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", + gfsynr1 & SMR_ID_MASK); + else + dev_err_ratelimited(smmu->dev, + "Unexpected global fault, this could be serious\n"); dev_err_ratelimited(smmu->dev, "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n", gfsr, gfsynr0, gfsynr1, gfsynr2); @@ -1111,9 +1121,9 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain, */ for (i = 0; i < cfg->num_streamids; ++i) { u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; + u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(idx)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } arm_smmu_master_free_smrs(smmu, cfg); @@ -1476,11 +1486,11 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - /* Mark all SMRn as invalid and all S2CRn as bypass */ + /* Mark all SMRn as invalid and all S2CRn as bypass unless overridden */ + reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; for (i = 0; i < smmu->num_mapping_groups; ++i) { writel_relaxed(0, gr0_base + ARM_SMMU_GR0_SMR(i)); - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(i)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i)); } /* Make sure all context banks are disabled and clear CB_FSR */ @@ -1502,8 +1512,12 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) /* Disable TLB broadcasting. */ reg |= (sCR0_VMIDPNE | sCR0_PTM); - /* Enable client access, but bypass when no mapping is found */ - reg &= ~(sCR0_CLIENTPD | sCR0_USFCFG); + /* Enable client access, handling unmatched streams as appropriate */ + reg &= ~sCR0_CLIENTPD; + if (disable_bypass) + reg |= sCR0_USFCFG; + else + reg &= ~sCR0_USFCFG; /* Disable forced broadcasting */ reg &= ~sCR0_FB; -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: linux-arm-kernel Borrow the disable_bypass parameter from the SMMUv3 driver as a handy debugging/security feature so that unmatched stream IDs (i.e. devices not attached to an IOMMU domain) may be configured to fault. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- 1 file changed, 23 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 1f9093d..d1b7dc1 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -141,6 +141,8 @@ #define ID2_PTFS_16K (1 << 13) #define ID2_PTFS_64K (1 << 14) +#define sGFSR_USF (1 << 1) + /* Global TLB invalidation */ #define ARM_SMMU_GR0_TLBIVMID 0x64 #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 @@ -263,6 +265,10 @@ static int force_stage; module_param_named(force_stage, force_stage, int, S_IRUGO); MODULE_PARM_DESC(force_stage, "Force SMMU mappings to be installed@a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); +static bool disable_bypass; +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); +MODULE_PARM_DESC(disable_bypass, + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); enum arm_smmu_arch_version { ARM_SMMU_V1 = 1, @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) if (!gfsr) return IRQ_NONE; - dev_err_ratelimited(smmu->dev, - "Unexpected global fault, this could be serious\n"); + if (gfsr & sGFSR_USF) + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", + gfsynr1 & SMR_ID_MASK); + else + dev_err_ratelimited(smmu->dev, + "Unexpected global fault, this could be serious\n"); dev_err_ratelimited(smmu->dev, "\tGFSR 0x%08x, GFSYNR0 0x%08x, GFSYNR1 0x%08x, GFSYNR2 0x%08x\n", gfsr, gfsynr0, gfsynr1, gfsynr2); @@ -1111,9 +1121,9 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain, */ for (i = 0; i < cfg->num_streamids; ++i) { u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; + u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(idx)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } arm_smmu_master_free_smrs(smmu, cfg); @@ -1476,11 +1486,11 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - /* Mark all SMRn as invalid and all S2CRn as bypass */ + /* Mark all SMRn as invalid and all S2CRn as bypass unless overridden */ + reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; for (i = 0; i < smmu->num_mapping_groups; ++i) { writel_relaxed(0, gr0_base + ARM_SMMU_GR0_SMR(i)); - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(i)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i)); } /* Make sure all context banks are disabled and clear CB_FSR */ @@ -1502,8 +1512,12 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) /* Disable TLB broadcasting. */ reg |= (sCR0_VMIDPNE | sCR0_PTM); - /* Enable client access, but bypass when no mapping is found */ - reg &= ~(sCR0_CLIENTPD | sCR0_USFCFG); + /* Enable client access, handling unmatched streams as appropriate */ + reg &= ~sCR0_CLIENTPD; + if (disable_bypass) + reg |= sCR0_USFCFG; + else + reg &= ~sCR0_USFCFG; /* Disable forced broadcasting */ reg &= ~sCR0_FB; -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
[parent not found: <10ebf5a136a787da54ffd1d6167953f82f01a834.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass 2016-01-26 18:06 ` Robin Murphy @ 2016-02-09 14:06 ` Will Deacon -1 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:06 UTC (permalink / raw) To: Robin Murphy Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 Hi Robin, On Tue, Jan 26, 2016 at 06:06:35PM +0000, Robin Murphy wrote: > Borrow the disable_bypass parameter from the SMMUv3 driver as a handy > debugging/security feature so that unmatched stream IDs (i.e. devices > not attached to an IOMMU domain) may be configured to fault. > > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> > --- > drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- > 1 file changed, 23 insertions(+), 9 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 1f9093d..d1b7dc1 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -141,6 +141,8 @@ > #define ID2_PTFS_16K (1 << 13) > #define ID2_PTFS_64K (1 << 14) > > +#define sGFSR_USF (1 << 1) > + > /* Global TLB invalidation */ > #define ARM_SMMU_GR0_TLBIVMID 0x64 > #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 > @@ -263,6 +265,10 @@ static int force_stage; > module_param_named(force_stage, force_stage, int, S_IRUGO); > MODULE_PARM_DESC(force_stage, > "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); > +static bool disable_bypass; > +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); > +MODULE_PARM_DESC(disable_bypass, > + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); > > enum arm_smmu_arch_version { > ARM_SMMU_V1 = 1, > @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > if (!gfsr) > return IRQ_NONE; > > - dev_err_ratelimited(smmu->dev, > - "Unexpected global fault, this could be serious\n"); > + if (gfsr & sGFSR_USF) > + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", > + gfsynr1 & SMR_ID_MASK); > + else I think I'd rather drop this message -- a global error is still indicative that something has gone horriby wrong, and we already print the gsynr registers below. Furthermore, if we take multiple faults, it might look like they're all exclusively due to unmatched streams. On top of that, I'm not convinced that USF will be set for an SMMU that doesn't implement stream-matching (because it will match an S2CR of type FAULT). Other than that, looks good. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass @ 2016-02-09 14:06 ` Will Deacon 0 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:06 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, On Tue, Jan 26, 2016 at 06:06:35PM +0000, Robin Murphy wrote: > Borrow the disable_bypass parameter from the SMMUv3 driver as a handy > debugging/security feature so that unmatched stream IDs (i.e. devices > not attached to an IOMMU domain) may be configured to fault. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- > 1 file changed, 23 insertions(+), 9 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 1f9093d..d1b7dc1 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -141,6 +141,8 @@ > #define ID2_PTFS_16K (1 << 13) > #define ID2_PTFS_64K (1 << 14) > > +#define sGFSR_USF (1 << 1) > + > /* Global TLB invalidation */ > #define ARM_SMMU_GR0_TLBIVMID 0x64 > #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 > @@ -263,6 +265,10 @@ static int force_stage; > module_param_named(force_stage, force_stage, int, S_IRUGO); > MODULE_PARM_DESC(force_stage, > "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); > +static bool disable_bypass; > +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); > +MODULE_PARM_DESC(disable_bypass, > + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); > > enum arm_smmu_arch_version { > ARM_SMMU_V1 = 1, > @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > if (!gfsr) > return IRQ_NONE; > > - dev_err_ratelimited(smmu->dev, > - "Unexpected global fault, this could be serious\n"); > + if (gfsr & sGFSR_USF) > + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", > + gfsynr1 & SMR_ID_MASK); > + else I think I'd rather drop this message -- a global error is still indicative that something has gone horriby wrong, and we already print the gsynr registers below. Furthermore, if we take multiple faults, it might look like they're all exclusively due to unmatched streams. On top of that, I'm not convinced that USF will be set for an SMMU that doesn't implement stream-matching (because it will match an S2CR of type FAULT). Other than that, looks good. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <20160209140631.GM22874-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass 2016-02-09 14:06 ` Will Deacon @ 2016-02-10 12:10 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 12:10 UTC (permalink / raw) To: Will Deacon Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 On 09/02/16 14:06, Will Deacon wrote: > Hi Robin, > > On Tue, Jan 26, 2016 at 06:06:35PM +0000, Robin Murphy wrote: >> Borrow the disable_bypass parameter from the SMMUv3 driver as a handy >> debugging/security feature so that unmatched stream IDs (i.e. devices >> not attached to an IOMMU domain) may be configured to fault. >> >> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- >> 1 file changed, 23 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 1f9093d..d1b7dc1 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -141,6 +141,8 @@ >> #define ID2_PTFS_16K (1 << 13) >> #define ID2_PTFS_64K (1 << 14) >> >> +#define sGFSR_USF (1 << 1) >> + >> /* Global TLB invalidation */ >> #define ARM_SMMU_GR0_TLBIVMID 0x64 >> #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 >> @@ -263,6 +265,10 @@ static int force_stage; >> module_param_named(force_stage, force_stage, int, S_IRUGO); >> MODULE_PARM_DESC(force_stage, >> "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); >> +static bool disable_bypass; >> +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); >> +MODULE_PARM_DESC(disable_bypass, >> + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); >> >> enum arm_smmu_arch_version { >> ARM_SMMU_V1 = 1, >> @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) >> if (!gfsr) >> return IRQ_NONE; >> >> - dev_err_ratelimited(smmu->dev, >> - "Unexpected global fault, this could be serious\n"); >> + if (gfsr & sGFSR_USF) >> + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", >> + gfsynr1 & SMR_ID_MASK); >> + else > > I think I'd rather drop this message -- a global error is still indicative > that something has gone horriby wrong, and we already print the gsynr > registers below. Furthermore, if we take multiple faults, it might look > like they're all exclusively due to unmatched streams. > > On top of that, I'm not convinced that USF will be set for an SMMU that > doesn't implement stream-matching (because it will match an S2CR of type > FAULT). You're quite right, by the look of it the stream indexing case should report ICF rather than USF. My rationale for special-casing the message was that it's less of an "unexpected" fault when it's something you've specifically asked the driver for, but it also seems reasonable that someone turning on such an "I know what I'm doing" feature might be able to decode GFSR themselves. Considering the machinations of matching !MULTI and USF or ICF as appropriate just for the sake of printing a slightly reworded message, I'll drop this hunk. > Other than that, looks good. Thanks, Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass @ 2016-02-10 12:10 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 12:10 UTC (permalink / raw) To: linux-arm-kernel On 09/02/16 14:06, Will Deacon wrote: > Hi Robin, > > On Tue, Jan 26, 2016 at 06:06:35PM +0000, Robin Murphy wrote: >> Borrow the disable_bypass parameter from the SMMUv3 driver as a handy >> debugging/security feature so that unmatched stream IDs (i.e. devices >> not attached to an IOMMU domain) may be configured to fault. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++++++--------- >> 1 file changed, 23 insertions(+), 9 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 1f9093d..d1b7dc1 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -141,6 +141,8 @@ >> #define ID2_PTFS_16K (1 << 13) >> #define ID2_PTFS_64K (1 << 14) >> >> +#define sGFSR_USF (1 << 1) >> + >> /* Global TLB invalidation */ >> #define ARM_SMMU_GR0_TLBIVMID 0x64 >> #define ARM_SMMU_GR0_TLBIALLNSNH 0x68 >> @@ -263,6 +265,10 @@ static int force_stage; >> module_param_named(force_stage, force_stage, int, S_IRUGO); >> MODULE_PARM_DESC(force_stage, >> "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); >> +static bool disable_bypass; >> +module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); >> +MODULE_PARM_DESC(disable_bypass, >> + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); >> >> enum arm_smmu_arch_version { >> ARM_SMMU_V1 = 1, >> @@ -704,8 +710,12 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev) >> if (!gfsr) >> return IRQ_NONE; >> >> - dev_err_ratelimited(smmu->dev, >> - "Unexpected global fault, this could be serious\n"); >> + if (gfsr & sGFSR_USF) >> + dev_err_ratelimited(smmu->dev, "Unmatched stream fault: ID 0x%x\n", >> + gfsynr1 & SMR_ID_MASK); >> + else > > I think I'd rather drop this message -- a global error is still indicative > that something has gone horriby wrong, and we already print the gsynr > registers below. Furthermore, if we take multiple faults, it might look > like they're all exclusively due to unmatched streams. > > On top of that, I'm not convinced that USF will be set for an SMMU that > doesn't implement stream-matching (because it will match an S2CR of type > FAULT). You're quite right, by the look of it the stream indexing case should report ICF rather than USF. My rationale for special-casing the message was that it's less of an "unexpected" fault when it's something you've specifically asked the driver for, but it also seems reasonable that someone turning on such an "I know what I'm doing" feature might be able to decode GFSR themselves. Considering the machinations of matching !MULTI and USF or ICF as appropriate just for the sake of printing a slightly reworded message, I'll drop this hunk. > Other than that, looks good. Thanks, Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v2] iommu/arm-smmu: Allow disabling unmatched stream bypass 2016-02-09 14:06 ` Will Deacon @ 2016-02-10 14:25 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 14:25 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r Borrow the disable_bypass parameter from the SMMUv3 driver as a handy debugging/security feature so that unmatched stream IDs (i.e. devices not attached to an IOMMU domain) may be configured to fault. Rather than introduce unsightly inconsistency, or repeat the existing unnecessary use of module_param_named(), fix that as well in passing. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- v2: - Don't bother trying to report the fault specially. - Realise that calling module_param_named() with identical variable and parameter names is silly. drivers/iommu/arm-smmu.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 1f9093d..cf1e330 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -260,9 +260,13 @@ #define FSYNR0_WNR (1 << 4) static int force_stage; -module_param_named(force_stage, force_stage, int, S_IRUGO); +module_param(force_stage, int, S_IRUGO); MODULE_PARM_DESC(force_stage, "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); +static bool disable_bypass; +module_param(disable_bypass, bool, S_IRUGO); +MODULE_PARM_DESC(disable_bypass, + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); enum arm_smmu_arch_version { ARM_SMMU_V1 = 1, @@ -1111,9 +1115,9 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain, */ for (i = 0; i < cfg->num_streamids; ++i) { u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; + u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(idx)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } arm_smmu_master_free_smrs(smmu, cfg); @@ -1476,11 +1480,11 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - /* Mark all SMRn as invalid and all S2CRn as bypass */ + /* Mark all SMRn as invalid and all S2CRn as bypass unless overridden */ + reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; for (i = 0; i < smmu->num_mapping_groups; ++i) { writel_relaxed(0, gr0_base + ARM_SMMU_GR0_SMR(i)); - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(i)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i)); } /* Make sure all context banks are disabled and clear CB_FSR */ @@ -1502,8 +1506,12 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) /* Disable TLB broadcasting. */ reg |= (sCR0_VMIDPNE | sCR0_PTM); - /* Enable client access, but bypass when no mapping is found */ - reg &= ~(sCR0_CLIENTPD | sCR0_USFCFG); + /* Enable client access, handling unmatched streams as appropriate */ + reg &= ~sCR0_CLIENTPD; + if (disable_bypass) + reg |= sCR0_USFCFG; + else + reg &= ~sCR0_USFCFG; /* Disable forced broadcasting */ reg &= ~sCR0_FB; -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v2] iommu/arm-smmu: Allow disabling unmatched stream bypass @ 2016-02-10 14:25 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 14:25 UTC (permalink / raw) To: linux-arm-kernel Borrow the disable_bypass parameter from the SMMUv3 driver as a handy debugging/security feature so that unmatched stream IDs (i.e. devices not attached to an IOMMU domain) may be configured to fault. Rather than introduce unsightly inconsistency, or repeat the existing unnecessary use of module_param_named(), fix that as well in passing. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- v2: - Don't bother trying to report the fault specially. - Realise that calling module_param_named() with identical variable and parameter names is silly. drivers/iommu/arm-smmu.c | 24 ++++++++++++++++-------- 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 1f9093d..cf1e330 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -260,9 +260,13 @@ #define FSYNR0_WNR (1 << 4) static int force_stage; -module_param_named(force_stage, force_stage, int, S_IRUGO); +module_param(force_stage, int, S_IRUGO); MODULE_PARM_DESC(force_stage, "Force SMMU mappings to be installed@a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation."); +static bool disable_bypass; +module_param(disable_bypass, bool, S_IRUGO); +MODULE_PARM_DESC(disable_bypass, + "Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU."); enum arm_smmu_arch_version { ARM_SMMU_V1 = 1, @@ -1111,9 +1115,9 @@ static void arm_smmu_domain_remove_master(struct arm_smmu_domain *smmu_domain, */ for (i = 0; i < cfg->num_streamids; ++i) { u32 idx = cfg->smrs ? cfg->smrs[i].idx : cfg->streamids[i]; + u32 reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(idx)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(idx)); } arm_smmu_master_free_smrs(smmu, cfg); @@ -1476,11 +1480,11 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg = readl_relaxed(ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sGFSR); - /* Mark all SMRn as invalid and all S2CRn as bypass */ + /* Mark all SMRn as invalid and all S2CRn as bypass unless overridden */ + reg = disable_bypass ? S2CR_TYPE_FAULT : S2CR_TYPE_BYPASS; for (i = 0; i < smmu->num_mapping_groups; ++i) { writel_relaxed(0, gr0_base + ARM_SMMU_GR0_SMR(i)); - writel_relaxed(S2CR_TYPE_BYPASS, - gr0_base + ARM_SMMU_GR0_S2CR(i)); + writel_relaxed(reg, gr0_base + ARM_SMMU_GR0_S2CR(i)); } /* Make sure all context banks are disabled and clear CB_FSR */ @@ -1502,8 +1506,12 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) /* Disable TLB broadcasting. */ reg |= (sCR0_VMIDPNE | sCR0_PTM); - /* Enable client access, but bypass when no mapping is found */ - reg &= ~(sCR0_CLIENTPD | sCR0_USFCFG); + /* Enable client access, handling unmatched streams as appropriate */ + reg &= ~sCR0_CLIENTPD; + if (disable_bypass) + reg |= sCR0_USFCFG; + else + reg &= ~sCR0_USFCFG; /* Disable forced broadcasting */ reg &= ~sCR0_FB; -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/4] iommu/arm-smmu: Support DMA-API domains 2016-01-26 18:06 ` Robin Murphy @ 2016-01-26 18:06 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 With DMA mapping ops provided by the iommu-dma code, only a minimal contribution from the IOMMU driver is needed to create a suitable DMA-API domain for them to use. Implement this for the ARM SMMUs. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- drivers/iommu/arm-smmu-v3.c | 10 +++++++++- drivers/iommu/arm-smmu.c | 10 +++++++++- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 2087534..f003b3a 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -21,6 +21,7 @@ */ #include <linux/delay.h> +#include <linux/dma-iommu.h> #include <linux/err.h> #include <linux/interrupt.h> #include <linux/iommu.h> @@ -1396,7 +1397,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) { struct arm_smmu_domain *smmu_domain; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; /* @@ -1408,6 +1409,12 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) if (!smmu_domain) return NULL; + if (type == IOMMU_DOMAIN_DMA && + iommu_get_dma_cookie(&smmu_domain->domain)) { + kfree(smmu_domain); + return NULL; + } + mutex_init(&smmu_domain->init_mutex); spin_lock_init(&smmu_domain->pgtbl_lock); return &smmu_domain->domain; @@ -1436,6 +1443,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; + iommu_put_dma_cookie(domain); free_io_pgtable_ops(smmu_domain->pgtbl_ops); /* Free the CD and ASID, if we allocated them */ diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index d1b7dc1..18e0e10 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -29,6 +29,7 @@ #define pr_fmt(fmt) "arm-smmu: " fmt #include <linux/delay.h> +#include <linux/dma-iommu.h> #include <linux/dma-mapping.h> #include <linux/err.h> #include <linux/interrupt.h> @@ -976,7 +977,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) { struct arm_smmu_domain *smmu_domain; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; /* * Allocate the domain and initialise some of its data structures. @@ -987,6 +988,12 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) if (!smmu_domain) return NULL; + if (type == IOMMU_DOMAIN_DMA && + iommu_get_dma_cookie(&smmu_domain->domain)) { + kfree(smmu_domain); + return NULL; + } + mutex_init(&smmu_domain->init_mutex); spin_lock_init(&smmu_domain->pgtbl_lock); @@ -1001,6 +1008,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) * Free the domain resources. We assume that all devices have * already been detached. */ + iommu_put_dma_cookie(domain); arm_smmu_destroy_domain_context(domain); kfree(smmu_domain); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/4] iommu/arm-smmu: Support DMA-API domains @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: linux-arm-kernel With DMA mapping ops provided by the iommu-dma code, only a minimal contribution from the IOMMU driver is needed to create a suitable DMA-API domain for them to use. Implement this for the ARM SMMUs. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/arm-smmu-v3.c | 10 +++++++++- drivers/iommu/arm-smmu.c | 10 +++++++++- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 2087534..f003b3a 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -21,6 +21,7 @@ */ #include <linux/delay.h> +#include <linux/dma-iommu.h> #include <linux/err.h> #include <linux/interrupt.h> #include <linux/iommu.h> @@ -1396,7 +1397,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) { struct arm_smmu_domain *smmu_domain; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; /* @@ -1408,6 +1409,12 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) if (!smmu_domain) return NULL; + if (type == IOMMU_DOMAIN_DMA && + iommu_get_dma_cookie(&smmu_domain->domain)) { + kfree(smmu_domain); + return NULL; + } + mutex_init(&smmu_domain->init_mutex); spin_lock_init(&smmu_domain->pgtbl_lock); return &smmu_domain->domain; @@ -1436,6 +1443,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; + iommu_put_dma_cookie(domain); free_io_pgtable_ops(smmu_domain->pgtbl_ops); /* Free the CD and ASID, if we allocated them */ diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index d1b7dc1..18e0e10 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -29,6 +29,7 @@ #define pr_fmt(fmt) "arm-smmu: " fmt #include <linux/delay.h> +#include <linux/dma-iommu.h> #include <linux/dma-mapping.h> #include <linux/err.h> #include <linux/interrupt.h> @@ -976,7 +977,7 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) { struct arm_smmu_domain *smmu_domain; - if (type != IOMMU_DOMAIN_UNMANAGED) + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA) return NULL; /* * Allocate the domain and initialise some of its data structures. @@ -987,6 +988,12 @@ static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) if (!smmu_domain) return NULL; + if (type == IOMMU_DOMAIN_DMA && + iommu_get_dma_cookie(&smmu_domain->domain)) { + kfree(smmu_domain); + return NULL; + } + mutex_init(&smmu_domain->init_mutex); spin_lock_init(&smmu_domain->pgtbl_lock); @@ -1001,6 +1008,7 @@ static void arm_smmu_domain_free(struct iommu_domain *domain) * Free the domain resources. We assume that all devices have * already been detached. */ + iommu_put_dma_cookie(domain); arm_smmu_destroy_domain_context(domain); kfree(smmu_domain); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2016-01-26 18:06 ` Robin Murphy @ 2016-01-26 18:06 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 TLB synchronisation is a mighty big hammmer to bring down on the transaction stream, typically stalling all in-flight transactions until the sync completes. Since in most cases (except at stage 2 on SMMUv1) a per-context sync operation is available, prefer that over the global operation when performing TLB maintenance for a single domain, to avoid unecessarily disrupting ongoing traffic in other contexts. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 18e0e10..bf1895c 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -219,6 +219,8 @@ #define ARM_SMMU_CB_S1_TLBIVAL 0x620 #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 +#define ARM_SMMU_CB_TLBSYNC 0x7f0 +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 #define ARM_SMMU_CB_ATS1PR 0x800 #define ARM_SMMU_CB_ATSR 0x8f0 @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) } /* Wait for any pending TLB invalidations to complete */ -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) { int count = 0; - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); + void __iomem *base, __iomem *status; - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) - & sTLBGSTATUS_GSACTIVE) { + if (cbndx < 0) { + base = ARM_SMMU_GR0(smmu); + status = base + ARM_SMMU_GR0_sTLBGSTATUS; + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); + } else { + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); + status = base + ARM_SMMU_CB_TLBSTATUS; + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); + } + + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { cpu_relax(); if (++count == TLB_LOOP_TIMEOUT) { dev_err_ratelimited(smmu->dev, @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) static void arm_smmu_tlb_sync(void *cookie) { struct arm_smmu_domain *smmu_domain = cookie; - __arm_smmu_tlb_sync(smmu_domain->smmu); + int cbndx = smmu_domain->cfg.cbndx; + + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && + smmu_domain->smmu->version < ARM_SMMU_V2) + cbndx = -1; + + __arm_smmu_tlb_sync(smmu_domain->smmu, cbndx); } static void arm_smmu_tlb_inv_context(void *cookie) @@ -588,7 +604,7 @@ static void arm_smmu_tlb_inv_context(void *cookie) base + ARM_SMMU_GR0_TLBIVMID); } - __arm_smmu_tlb_sync(smmu); + __arm_smmu_tlb_sync(smmu, cfg->cbndx); } static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, @@ -1534,7 +1550,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg &= ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT); /* Push the button */ - __arm_smmu_tlb_sync(smmu); + __arm_smmu_tlb_sync(smmu, -1); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2016-01-26 18:06 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-01-26 18:06 UTC (permalink / raw) To: linux-arm-kernel TLB synchronisation is a mighty big hammmer to bring down on the transaction stream, typically stalling all in-flight transactions until the sync completes. Since in most cases (except at stage 2 on SMMUv1) a per-context sync operation is available, prefer that over the global operation when performing TLB maintenance for a single domain, to avoid unecessarily disrupting ongoing traffic in other contexts. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- 1 file changed, 24 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 18e0e10..bf1895c 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -219,6 +219,8 @@ #define ARM_SMMU_CB_S1_TLBIVAL 0x620 #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 +#define ARM_SMMU_CB_TLBSYNC 0x7f0 +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 #define ARM_SMMU_CB_ATS1PR 0x800 #define ARM_SMMU_CB_ATSR 0x8f0 @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) } /* Wait for any pending TLB invalidations to complete */ -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) { int count = 0; - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); + void __iomem *base, __iomem *status; - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) - & sTLBGSTATUS_GSACTIVE) { + if (cbndx < 0) { + base = ARM_SMMU_GR0(smmu); + status = base + ARM_SMMU_GR0_sTLBGSTATUS; + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); + } else { + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); + status = base + ARM_SMMU_CB_TLBSTATUS; + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); + } + + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { cpu_relax(); if (++count == TLB_LOOP_TIMEOUT) { dev_err_ratelimited(smmu->dev, @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) static void arm_smmu_tlb_sync(void *cookie) { struct arm_smmu_domain *smmu_domain = cookie; - __arm_smmu_tlb_sync(smmu_domain->smmu); + int cbndx = smmu_domain->cfg.cbndx; + + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && + smmu_domain->smmu->version < ARM_SMMU_V2) + cbndx = -1; + + __arm_smmu_tlb_sync(smmu_domain->smmu, cbndx); } static void arm_smmu_tlb_inv_context(void *cookie) @@ -588,7 +604,7 @@ static void arm_smmu_tlb_inv_context(void *cookie) base + ARM_SMMU_GR0_TLBIVMID); } - __arm_smmu_tlb_sync(smmu); + __arm_smmu_tlb_sync(smmu, cfg->cbndx); } static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, @@ -1534,7 +1550,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg &= ~(sCR0_BSU_MASK << sCR0_BSU_SHIFT); /* Push the button */ - __arm_smmu_tlb_sync(smmu); + __arm_smmu_tlb_sync(smmu, -1); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); } -- 2.7.0.25.gfc10eb5.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
[parent not found: <fc2ede950aea2e84a0b7cf93e4cd605f0cbdf3c3.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2016-01-26 18:06 ` Robin Murphy @ 2016-02-09 14:15 ` Will Deacon -1 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:15 UTC (permalink / raw) To: Robin Murphy Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 On Tue, Jan 26, 2016 at 06:06:37PM +0000, Robin Murphy wrote: > TLB synchronisation is a mighty big hammmer to bring down on the > transaction stream, typically stalling all in-flight transactions until > the sync completes. Since in most cases (except at stage 2 on SMMUv1) > a per-context sync operation is available, prefer that over the global > operation when performing TLB maintenance for a single domain, to avoid > unecessarily disrupting ongoing traffic in other contexts. > > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> > --- > drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- > 1 file changed, 24 insertions(+), 8 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 18e0e10..bf1895c 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -219,6 +219,8 @@ > #define ARM_SMMU_CB_S1_TLBIVAL 0x620 > #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 > #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 > +#define ARM_SMMU_CB_TLBSYNC 0x7f0 > +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 > #define ARM_SMMU_CB_ATS1PR 0x800 > #define ARM_SMMU_CB_ATSR 0x8f0 > > @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) > } > > /* Wait for any pending TLB invalidations to complete */ > -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) > { > int count = 0; > - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); > + void __iomem *base, __iomem *status; > > - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); > - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) > - & sTLBGSTATUS_GSACTIVE) { > + if (cbndx < 0) { > + base = ARM_SMMU_GR0(smmu); > + status = base + ARM_SMMU_GR0_sTLBGSTATUS; > + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); > + } else { > + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); > + status = base + ARM_SMMU_CB_TLBSTATUS; > + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); > + } > + > + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { > cpu_relax(); > if (++count == TLB_LOOP_TIMEOUT) { > dev_err_ratelimited(smmu->dev, > @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > static void arm_smmu_tlb_sync(void *cookie) > { > struct arm_smmu_domain *smmu_domain = cookie; > - __arm_smmu_tlb_sync(smmu_domain->smmu); > + int cbndx = smmu_domain->cfg.cbndx; > + > + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && > + smmu_domain->smmu->version < ARM_SMMU_V2) > + cbndx = -1; I think it would be cleaner just to override the sync function pointer when we initialise a stage-2 page table for an SMMUv1 implementation. Any reason not to go that way? Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2016-02-09 14:15 ` Will Deacon 0 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:15 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jan 26, 2016 at 06:06:37PM +0000, Robin Murphy wrote: > TLB synchronisation is a mighty big hammmer to bring down on the > transaction stream, typically stalling all in-flight transactions until > the sync completes. Since in most cases (except at stage 2 on SMMUv1) > a per-context sync operation is available, prefer that over the global > operation when performing TLB maintenance for a single domain, to avoid > unecessarily disrupting ongoing traffic in other contexts. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- > 1 file changed, 24 insertions(+), 8 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 18e0e10..bf1895c 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -219,6 +219,8 @@ > #define ARM_SMMU_CB_S1_TLBIVAL 0x620 > #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 > #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 > +#define ARM_SMMU_CB_TLBSYNC 0x7f0 > +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 > #define ARM_SMMU_CB_ATS1PR 0x800 > #define ARM_SMMU_CB_ATSR 0x8f0 > > @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) > } > > /* Wait for any pending TLB invalidations to complete */ > -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) > { > int count = 0; > - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); > + void __iomem *base, __iomem *status; > > - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); > - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) > - & sTLBGSTATUS_GSACTIVE) { > + if (cbndx < 0) { > + base = ARM_SMMU_GR0(smmu); > + status = base + ARM_SMMU_GR0_sTLBGSTATUS; > + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); > + } else { > + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); > + status = base + ARM_SMMU_CB_TLBSTATUS; > + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); > + } > + > + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { > cpu_relax(); > if (++count == TLB_LOOP_TIMEOUT) { > dev_err_ratelimited(smmu->dev, > @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > static void arm_smmu_tlb_sync(void *cookie) > { > struct arm_smmu_domain *smmu_domain = cookie; > - __arm_smmu_tlb_sync(smmu_domain->smmu); > + int cbndx = smmu_domain->cfg.cbndx; > + > + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && > + smmu_domain->smmu->version < ARM_SMMU_V2) > + cbndx = -1; I think it would be cleaner just to override the sync function pointer when we initialise a stage-2 page table for an SMMUv1 implementation. Any reason not to go that way? Will ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <20160209141508.GO22874-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2016-02-09 14:15 ` Will Deacon @ 2016-02-10 11:58 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 11:58 UTC (permalink / raw) To: Will Deacon Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 On 09/02/16 14:15, Will Deacon wrote: > On Tue, Jan 26, 2016 at 06:06:37PM +0000, Robin Murphy wrote: >> TLB synchronisation is a mighty big hammmer to bring down on the >> transaction stream, typically stalling all in-flight transactions until >> the sync completes. Since in most cases (except at stage 2 on SMMUv1) >> a per-context sync operation is available, prefer that over the global >> operation when performing TLB maintenance for a single domain, to avoid >> unecessarily disrupting ongoing traffic in other contexts. >> >> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- >> 1 file changed, 24 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 18e0e10..bf1895c 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -219,6 +219,8 @@ >> #define ARM_SMMU_CB_S1_TLBIVAL 0x620 >> #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 >> #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 >> +#define ARM_SMMU_CB_TLBSYNC 0x7f0 >> +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 >> #define ARM_SMMU_CB_ATS1PR 0x800 >> #define ARM_SMMU_CB_ATSR 0x8f0 >> >> @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) >> } >> >> /* Wait for any pending TLB invalidations to complete */ >> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) >> { >> int count = 0; >> - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); >> + void __iomem *base, __iomem *status; >> >> - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); >> - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) >> - & sTLBGSTATUS_GSACTIVE) { >> + if (cbndx < 0) { >> + base = ARM_SMMU_GR0(smmu); >> + status = base + ARM_SMMU_GR0_sTLBGSTATUS; >> + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); >> + } else { >> + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); >> + status = base + ARM_SMMU_CB_TLBSTATUS; >> + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); >> + } >> + >> + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { >> cpu_relax(); >> if (++count == TLB_LOOP_TIMEOUT) { >> dev_err_ratelimited(smmu->dev, >> @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> static void arm_smmu_tlb_sync(void *cookie) >> { >> struct arm_smmu_domain *smmu_domain = cookie; >> - __arm_smmu_tlb_sync(smmu_domain->smmu); >> + int cbndx = smmu_domain->cfg.cbndx; >> + >> + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && >> + smmu_domain->smmu->version < ARM_SMMU_V2) >> + cbndx = -1; > > I think it would be cleaner just to override the sync function pointer > when we initialise a stage-2 page table for an SMMUv1 implementation. > > Any reason not to go that way? Frankly, the idea just didn't occur to me. Looking more closely, if we were to simply put a set of iommu_gather_ops pointers in each domain based on the format, we could then also break up the confusing mess of arm_smmu_tlb_inv_range_nosync(), which would be nice. I'll give that a go on top of the context format series I'm preparing (with which it would otherwise conflict horribly) and see how it looks. Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2016-02-10 11:58 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2016-02-10 11:58 UTC (permalink / raw) To: linux-arm-kernel On 09/02/16 14:15, Will Deacon wrote: > On Tue, Jan 26, 2016 at 06:06:37PM +0000, Robin Murphy wrote: >> TLB synchronisation is a mighty big hammmer to bring down on the >> transaction stream, typically stalling all in-flight transactions until >> the sync completes. Since in most cases (except at stage 2 on SMMUv1) >> a per-context sync operation is available, prefer that over the global >> operation when performing TLB maintenance for a single domain, to avoid >> unecessarily disrupting ongoing traffic in other contexts. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> drivers/iommu/arm-smmu.c | 32 ++++++++++++++++++++++++-------- >> 1 file changed, 24 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 18e0e10..bf1895c 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -219,6 +219,8 @@ >> #define ARM_SMMU_CB_S1_TLBIVAL 0x620 >> #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 >> #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 >> +#define ARM_SMMU_CB_TLBSYNC 0x7f0 >> +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 >> #define ARM_SMMU_CB_ATS1PR 0x800 >> #define ARM_SMMU_CB_ATSR 0x8f0 >> >> @@ -546,14 +548,22 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) >> } >> >> /* Wait for any pending TLB invalidations to complete */ >> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int cbndx) >> { >> int count = 0; >> - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); >> + void __iomem *base, __iomem *status; >> >> - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); >> - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) >> - & sTLBGSTATUS_GSACTIVE) { >> + if (cbndx < 0) { >> + base = ARM_SMMU_GR0(smmu); >> + status = base + ARM_SMMU_GR0_sTLBGSTATUS; >> + writel_relaxed(0, base + ARM_SMMU_GR0_sTLBGSYNC); >> + } else { >> + base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cbndx); >> + status = base + ARM_SMMU_CB_TLBSTATUS; >> + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); >> + } >> + >> + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { >> cpu_relax(); >> if (++count == TLB_LOOP_TIMEOUT) { >> dev_err_ratelimited(smmu->dev, >> @@ -567,7 +577,13 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> static void arm_smmu_tlb_sync(void *cookie) >> { >> struct arm_smmu_domain *smmu_domain = cookie; >> - __arm_smmu_tlb_sync(smmu_domain->smmu); >> + int cbndx = smmu_domain->cfg.cbndx; >> + >> + if (smmu_domain->stage == ARM_SMMU_DOMAIN_S2 && >> + smmu_domain->smmu->version < ARM_SMMU_V2) >> + cbndx = -1; > > I think it would be cleaner just to override the sync function pointer > when we initialise a stage-2 page table for an SMMUv1 implementation. > > Any reason not to go that way? Frankly, the idea just didn't occur to me. Looking more closely, if we were to simply put a set of iommu_gather_ops pointers in each domain based on the format, we could then also break up the confusing mess of arm_smmu_tlb_inv_range_nosync(), which would be nice. I'll give that a go on top of the context format series I'm preparing (with which it would otherwise conflict horribly) and see how it looks. Robin. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/4] Miscellaneous ARM SMMU patches 2016-01-26 18:06 ` Robin Murphy @ 2016-02-09 14:16 ` Will Deacon -1 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:16 UTC (permalink / raw) To: Robin Murphy Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r, tchalamarla-M3mlKVOIwJVv6pq1l3V1OdBPR1lH4CV8 On Tue, Jan 26, 2016 at 06:06:33PM +0000, Robin Murphy wrote: > Here's my current "miscellaneous SMMU enhancements" branch for your > consideration. Patch #1 solves a subtle corner case that cropped up > while exercising stage 1 context formats on the DMA330-MMU500 model; > #3 will be wanted fairly soon for DMA ops integration so may as well > get some exposure now; #2 and #4 are more nice-to-haves. Cheers, Robin. I've queued #1 and #3 for 4.6. If you address the comments on the other two, I can queue those as well. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/4] Miscellaneous ARM SMMU patches @ 2016-02-09 14:16 ` Will Deacon 0 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2016-02-09 14:16 UTC (permalink / raw) To: linux-arm-kernel On Tue, Jan 26, 2016 at 06:06:33PM +0000, Robin Murphy wrote: > Here's my current "miscellaneous SMMU enhancements" branch for your > consideration. Patch #1 solves a subtle corner case that cropped up > while exercising stage 1 context formats on the DMA330-MMU500 model; > #3 will be wanted fairly soon for DMA ops integration so may as well > get some exposure now; #2 and #4 are more nice-to-haves. Cheers, Robin. I've queued #1 and #3 for 4.6. If you address the comments on the other two, I can queue those as well. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/4] ARM SMMU per-context TLB sync @ 2017-03-07 18:09 Robin Murphy [not found] ` <cover.1488907474.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 0 siblings, 1 reply; 32+ messages in thread From: Robin Murphy @ 2017-03-07 18:09 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r The discussion around context-level access for Qualcomm SMMUs reminded me to dig up this patch I started ages ago and finish it off. As it's ended up, it's now a mini-series, with some new preparatory cleanup manifesting as patches 2 and 3. Patch 1 is broken out of patch 3 for clarity as somewhat of a fix in its own right, in that it's really an entirely unrelated thing which came up in parallel, but happens to be inherent to code I'm touching here anyway. Robin. Robin Murphy (4): iommu/arm-smmu: Handle size mismatches better iommu/arm-smmu: Simplify ASID/VMID handling iommu/arm-smmu: Tidy up context bank indexing iommu/arm-smmu: Use per-context TLB sync as appropriate drivers/iommu/arm-smmu.c | 207 +++++++++++++++++++++++++++++------------------ 1 file changed, 127 insertions(+), 80 deletions(-) -- 2.11.0.dirty ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <cover.1488907474.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2017-03-07 18:09 [PATCH 0/4] ARM SMMU per-context TLB sync Robin Murphy @ 2017-03-07 18:09 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-03-07 18:09 UTC (permalink / raw) To: will.deacon-5wv7dgnIgG8 Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r TLB synchronisation typically involves the SMMU blocking all incoming transactions until the TLBs report completion of all outstanding operations. In the common SMMUv2 configuration of a single distributed SMMU serving multiple peripherals, that means that a single unmap request has the potential to bring the hammer down on the entire system if synchronised globally. Since stage 1 contexts, and stage 2 contexts under SMMUv2, offer local sync operations, let's make use of those wherever we can in the hope of minimising global disruption. To that end, rather than add any more branches to the already unwieldy monolithic TLB maintenance ops, break them up into smaller, neater, functions which we can then mix and match as appropriate. Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> --- drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- 1 file changed, 100 insertions(+), 56 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index c8aafe304171..f7411109670f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { #define ARM_SMMU_CB_S1_TLBIVAL 0x620 #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 +#define ARM_SMMU_CB_TLBSYNC 0x7f0 +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 #define ARM_SMMU_CB_ATS1PR 0x800 #define ARM_SMMU_CB_ATSR 0x8f0 @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) } /* Wait for any pending TLB invalidations to complete */ -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, + void __iomem *sync, void __iomem *status) { int count = 0; - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) - & sTLBGSTATUS_GSACTIVE) { + writel_relaxed(0, sync); + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { cpu_relax(); if (++count == TLB_LOOP_TIMEOUT) { dev_err_ratelimited(smmu->dev, @@ -587,29 +588,49 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) } } -static void arm_smmu_tlb_sync(void *cookie) +static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu) { - struct arm_smmu_domain *smmu_domain = cookie; - __arm_smmu_tlb_sync(smmu_domain->smmu); + void __iomem *base = ARM_SMMU_GR0(smmu); + + __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC, + base + ARM_SMMU_GR0_sTLBGSTATUS); } -static void arm_smmu_tlb_inv_context(void *cookie) +static void arm_smmu_tlb_sync_context(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx); + + __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC, + base + ARM_SMMU_CB_TLBSTATUS); +} + +static void arm_smmu_tlb_sync_vmid(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + + arm_smmu_tlb_sync_global(smmu_domain->smmu); +} + +static void arm_smmu_tlb_inv_context_s1(void *cookie) { struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; + void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); + arm_smmu_tlb_sync_context(cookie); +} + +static void arm_smmu_tlb_inv_context_s2(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_device *smmu = smmu_domain->smmu; - bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; - void __iomem *base; + void __iomem *base = ARM_SMMU_GR0(smmu); - if (stage1) { - base = ARM_SMMU_CB(smmu, cfg->cbndx); - writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); - } else { - base = ARM_SMMU_GR0(smmu); - writel_relaxed(cfg->vmid, base + ARM_SMMU_GR0_TLBIVMID); - } - - __arm_smmu_tlb_sync(smmu); + writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); + arm_smmu_tlb_sync_global(smmu); } static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, { struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; - struct arm_smmu_device *smmu = smmu_domain->smmu; bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; - void __iomem *reg; + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + size_t step; - if (stage1) { - reg = ARM_SMMU_CB(smmu, cfg->cbndx); - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; - - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { - iova &= ~12UL; - iova |= cfg->asid; - do { - writel_relaxed(iova, reg); - iova += granule; - } while (size -= granule); - } else { - iova >>= 12; - iova |= (u64)cfg->asid << 48; - do { - writeq_relaxed(iova, reg); - iova += granule >> 12; - } while (size -= granule); - } - } else if (smmu->version == ARM_SMMU_V2) { - reg = ARM_SMMU_CB(smmu, cfg->cbndx); + if (stage1) + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : + ARM_SMMU_CB_S1_TLBIVA; + else reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : ARM_SMMU_CB_S2_TLBIIPAS2; - iova >>= 12; - do { - smmu_write_atomic_lq(iova, reg); - iova += granule >> 12; - } while (size -= granule); + + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { + iova &= ~12UL; + iova |= cfg->asid; + step = granule; } else { - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; - writel_relaxed(cfg->vmid, reg); + iova >>= 12; + step = granule >> 12; + if (stage1) + iova |= (u64)cfg->asid << 48; } + + do { + smmu_write_atomic_lq(iova, reg); + iova += step; + } while (size -= granule); } -static const struct iommu_gather_ops arm_smmu_gather_ops = { - .tlb_flush_all = arm_smmu_tlb_inv_context, +/* + * On MMU-401 at least, the cost of firing off multiple TLBIVMIDs appears + * almost negligible, but the benefit of getting the first one in as far ahead + * of the sync as possible is significant, hence we don't just make this a + * no-op and set .tlb_sync to arm_smmu_inv_context_s2() as you might think. + */ +static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size, + size_t granule, bool leaf, void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + void __iomem *base = ARM_SMMU_GR0(smmu); + + writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); +} + +static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s1, .tlb_add_flush = arm_smmu_tlb_inv_range_nosync, - .tlb_sync = arm_smmu_tlb_sync, + .tlb_sync = arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s2, + .tlb_add_flush = arm_smmu_tlb_inv_range_nosync, + .tlb_sync = arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v1 = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s2, + .tlb_add_flush = arm_smmu_tlb_inv_vmid_nosync, + .tlb_sync = arm_smmu_tlb_sync_vmid, }; static irqreturn_t arm_smmu_context_fault(int irq, void *dev) @@ -829,6 +868,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, enum io_pgtable_fmt fmt; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_cfg *cfg = &smmu_domain->cfg; + const struct iommu_gather_ops *tlb_ops; mutex_lock(&smmu_domain->init_mutex); if (smmu_domain->smmu) @@ -900,6 +940,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, ias = min(ias, 32UL); oas = min(oas, 32UL); } + tlb_ops = &arm_smmu_s1_tlb_ops; break; case ARM_SMMU_DOMAIN_NESTED: /* @@ -918,12 +959,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, ias = min(ias, 40UL); oas = min(oas, 40UL); } + if (smmu->version == ARM_SMMU_V2) + tlb_ops = &arm_smmu_s2_tlb_ops_v2; + else + tlb_ops = &arm_smmu_s2_tlb_ops_v1; break; default: ret = -EINVAL; goto out_unlock; } - ret = __arm_smmu_alloc_bitmap(smmu->context_map, start, smmu->num_context_banks); if (ret < 0) @@ -946,7 +990,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, .pgsize_bitmap = smmu->pgsize_bitmap, .ias = ias, .oas = oas, - .tlb = &arm_smmu_gather_ops, + .tlb = tlb_ops, .iommu_dev = smmu->dev, }; @@ -1730,7 +1774,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg |= sCR0_EXIDENABLE; /* Push the button */ - __arm_smmu_tlb_sync(smmu); + arm_smmu_tlb_sync_global(smmu); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); } -- 2.11.0.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2017-03-07 18:09 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-03-07 18:09 UTC (permalink / raw) To: linux-arm-kernel TLB synchronisation typically involves the SMMU blocking all incoming transactions until the TLBs report completion of all outstanding operations. In the common SMMUv2 configuration of a single distributed SMMU serving multiple peripherals, that means that a single unmap request has the potential to bring the hammer down on the entire system if synchronised globally. Since stage 1 contexts, and stage 2 contexts under SMMUv2, offer local sync operations, let's make use of those wherever we can in the hope of minimising global disruption. To that end, rather than add any more branches to the already unwieldy monolithic TLB maintenance ops, break them up into smaller, neater, functions which we can then mix and match as appropriate. Signed-off-by: Robin Murphy <robin.murphy@arm.com> --- drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- 1 file changed, 100 insertions(+), 56 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index c8aafe304171..f7411109670f 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { #define ARM_SMMU_CB_S1_TLBIVAL 0x620 #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 +#define ARM_SMMU_CB_TLBSYNC 0x7f0 +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 #define ARM_SMMU_CB_ATS1PR 0x800 #define ARM_SMMU_CB_ATSR 0x8f0 @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) } /* Wait for any pending TLB invalidations to complete */ -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, + void __iomem *sync, void __iomem *status) { int count = 0; - void __iomem *gr0_base = ARM_SMMU_GR0(smmu); - writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC); - while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS) - & sTLBGSTATUS_GSACTIVE) { + writel_relaxed(0, sync); + while (readl_relaxed(status) & sTLBGSTATUS_GSACTIVE) { cpu_relax(); if (++count == TLB_LOOP_TIMEOUT) { dev_err_ratelimited(smmu->dev, @@ -587,29 +588,49 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) } } -static void arm_smmu_tlb_sync(void *cookie) +static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu) { - struct arm_smmu_domain *smmu_domain = cookie; - __arm_smmu_tlb_sync(smmu_domain->smmu); + void __iomem *base = ARM_SMMU_GR0(smmu); + + __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC, + base + ARM_SMMU_GR0_sTLBGSTATUS); } -static void arm_smmu_tlb_inv_context(void *cookie) +static void arm_smmu_tlb_sync_context(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx); + + __arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC, + base + ARM_SMMU_CB_TLBSTATUS); +} + +static void arm_smmu_tlb_sync_vmid(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + + arm_smmu_tlb_sync_global(smmu_domain->smmu); +} + +static void arm_smmu_tlb_inv_context_s1(void *cookie) { struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; + void __iomem *base = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); + arm_smmu_tlb_sync_context(cookie); +} + +static void arm_smmu_tlb_inv_context_s2(void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_device *smmu = smmu_domain->smmu; - bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; - void __iomem *base; + void __iomem *base = ARM_SMMU_GR0(smmu); - if (stage1) { - base = ARM_SMMU_CB(smmu, cfg->cbndx); - writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); - } else { - base = ARM_SMMU_GR0(smmu); - writel_relaxed(cfg->vmid, base + ARM_SMMU_GR0_TLBIVMID); - } - - __arm_smmu_tlb_sync(smmu); + writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); + arm_smmu_tlb_sync_global(smmu); } static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, { struct arm_smmu_domain *smmu_domain = cookie; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; - struct arm_smmu_device *smmu = smmu_domain->smmu; bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; - void __iomem *reg; + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); + size_t step; - if (stage1) { - reg = ARM_SMMU_CB(smmu, cfg->cbndx); - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; - - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { - iova &= ~12UL; - iova |= cfg->asid; - do { - writel_relaxed(iova, reg); - iova += granule; - } while (size -= granule); - } else { - iova >>= 12; - iova |= (u64)cfg->asid << 48; - do { - writeq_relaxed(iova, reg); - iova += granule >> 12; - } while (size -= granule); - } - } else if (smmu->version == ARM_SMMU_V2) { - reg = ARM_SMMU_CB(smmu, cfg->cbndx); + if (stage1) + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : + ARM_SMMU_CB_S1_TLBIVA; + else reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : ARM_SMMU_CB_S2_TLBIIPAS2; - iova >>= 12; - do { - smmu_write_atomic_lq(iova, reg); - iova += granule >> 12; - } while (size -= granule); + + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { + iova &= ~12UL; + iova |= cfg->asid; + step = granule; } else { - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; - writel_relaxed(cfg->vmid, reg); + iova >>= 12; + step = granule >> 12; + if (stage1) + iova |= (u64)cfg->asid << 48; } + + do { + smmu_write_atomic_lq(iova, reg); + iova += step; + } while (size -= granule); } -static const struct iommu_gather_ops arm_smmu_gather_ops = { - .tlb_flush_all = arm_smmu_tlb_inv_context, +/* + * On MMU-401@least, the cost of firing off multiple TLBIVMIDs appears + * almost negligible, but the benefit of getting the first one in as far ahead + * of the sync as possible is significant, hence we don't just make this a + * no-op and set .tlb_sync to arm_smmu_inv_context_s2() as you might think. + */ +static void arm_smmu_tlb_inv_vmid_nosync(unsigned long iova, size_t size, + size_t granule, bool leaf, void *cookie) +{ + struct arm_smmu_domain *smmu_domain = cookie; + struct arm_smmu_device *smmu = smmu_domain->smmu; + void __iomem *base = ARM_SMMU_GR0(smmu); + + writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMID); +} + +static const struct iommu_gather_ops arm_smmu_s1_tlb_ops = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s1, .tlb_add_flush = arm_smmu_tlb_inv_range_nosync, - .tlb_sync = arm_smmu_tlb_sync, + .tlb_sync = arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s2, + .tlb_add_flush = arm_smmu_tlb_inv_range_nosync, + .tlb_sync = arm_smmu_tlb_sync_context, +}; + +static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v1 = { + .tlb_flush_all = arm_smmu_tlb_inv_context_s2, + .tlb_add_flush = arm_smmu_tlb_inv_vmid_nosync, + .tlb_sync = arm_smmu_tlb_sync_vmid, }; static irqreturn_t arm_smmu_context_fault(int irq, void *dev) @@ -829,6 +868,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, enum io_pgtable_fmt fmt; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_cfg *cfg = &smmu_domain->cfg; + const struct iommu_gather_ops *tlb_ops; mutex_lock(&smmu_domain->init_mutex); if (smmu_domain->smmu) @@ -900,6 +940,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, ias = min(ias, 32UL); oas = min(oas, 32UL); } + tlb_ops = &arm_smmu_s1_tlb_ops; break; case ARM_SMMU_DOMAIN_NESTED: /* @@ -918,12 +959,15 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, ias = min(ias, 40UL); oas = min(oas, 40UL); } + if (smmu->version == ARM_SMMU_V2) + tlb_ops = &arm_smmu_s2_tlb_ops_v2; + else + tlb_ops = &arm_smmu_s2_tlb_ops_v1; break; default: ret = -EINVAL; goto out_unlock; } - ret = __arm_smmu_alloc_bitmap(smmu->context_map, start, smmu->num_context_banks); if (ret < 0) @@ -946,7 +990,7 @@ static int arm_smmu_init_domain_context(struct iommu_domain *domain, .pgsize_bitmap = smmu->pgsize_bitmap, .ias = ias, .oas = oas, - .tlb = &arm_smmu_gather_ops, + .tlb = tlb_ops, .iommu_dev = smmu->dev, }; @@ -1730,7 +1774,7 @@ static void arm_smmu_device_reset(struct arm_smmu_device *smmu) reg |= sCR0_EXIDENABLE; /* Push the button */ - __arm_smmu_tlb_sync(smmu); + arm_smmu_tlb_sync_global(smmu); writel(reg, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); } -- 2.11.0.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
[parent not found: <03ef837586cd991f2d8431908230fd3a3dd8c9cf.1488907474.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2017-03-07 18:09 ` Robin Murphy @ 2017-03-30 14:37 ` Will Deacon -1 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2017-03-30 14:37 UTC (permalink / raw) To: Robin Murphy Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r Hi Robin, This mostly looks great, but I have a couple of minor comments below. On Tue, Mar 07, 2017 at 06:09:07PM +0000, Robin Murphy wrote: > TLB synchronisation typically involves the SMMU blocking all incoming > transactions until the TLBs report completion of all outstanding > operations. In the common SMMUv2 configuration of a single distributed > SMMU serving multiple peripherals, that means that a single unmap > request has the potential to bring the hammer down on the entire system > if synchronised globally. Since stage 1 contexts, and stage 2 contexts > under SMMUv2, offer local sync operations, let's make use of those > wherever we can in the hope of minimising global disruption. > > To that end, rather than add any more branches to the already unwieldy > monolithic TLB maintenance ops, break them up into smaller, neater, > functions which we can then mix and match as appropriate. > > Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> > --- > drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- > 1 file changed, 100 insertions(+), 56 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index c8aafe304171..f7411109670f 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { > #define ARM_SMMU_CB_S1_TLBIVAL 0x620 > #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 > #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 > +#define ARM_SMMU_CB_TLBSYNC 0x7f0 > +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 > #define ARM_SMMU_CB_ATS1PR 0x800 > #define ARM_SMMU_CB_ATSR 0x8f0 > > @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) > } > > /* Wait for any pending TLB invalidations to complete */ > -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, > + void __iomem *sync, void __iomem *status) Given that you take the arm_smmu_device anyway, I think I'd prefer just passing the offsets for sync and status and avoiding the additions in the caller (a bit like your other patch in this series ;). > static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, > @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, > { > struct arm_smmu_domain *smmu_domain = cookie; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - struct arm_smmu_device *smmu = smmu_domain->smmu; > bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; > - void __iomem *reg; > + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); > + size_t step; > > - if (stage1) { > - reg = ARM_SMMU_CB(smmu, cfg->cbndx); > - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; > - > - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { > - iova &= ~12UL; > - iova |= cfg->asid; > - do { > - writel_relaxed(iova, reg); > - iova += granule; > - } while (size -= granule); > - } else { > - iova >>= 12; > - iova |= (u64)cfg->asid << 48; > - do { > - writeq_relaxed(iova, reg); > - iova += granule >> 12; > - } while (size -= granule); > - } > - } else if (smmu->version == ARM_SMMU_V2) { > - reg = ARM_SMMU_CB(smmu, cfg->cbndx); > + if (stage1) > + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : > + ARM_SMMU_CB_S1_TLBIVA; > + else > reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : > ARM_SMMU_CB_S2_TLBIIPAS2; > - iova >>= 12; > - do { > - smmu_write_atomic_lq(iova, reg); > - iova += granule >> 12; > - } while (size -= granule); > + > + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { > + iova &= ~12UL; > + iova |= cfg->asid; > + step = granule; > } else { > - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; > - writel_relaxed(cfg->vmid, reg); > + iova >>= 12; > + step = granule >> 12; > + if (stage1) > + iova |= (u64)cfg->asid << 48; > } > + > + do { > + smmu_write_atomic_lq(iova, reg); > + iova += step; > + } while (size -= granule); There seems to be a lot of refactoring going on here, but I'm not entirely comfortable with the unconditional move to smmu_write_atomic_lq. Given the way in which arm_smmu_tlb_inv_range_nosync is now called (i.e. only for stage-1 or SMMUv2 stage-2), then I think you just need to delete the final else clause in the current code and you're done. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2017-03-30 14:37 ` Will Deacon 0 siblings, 0 replies; 32+ messages in thread From: Will Deacon @ 2017-03-30 14:37 UTC (permalink / raw) To: linux-arm-kernel Hi Robin, This mostly looks great, but I have a couple of minor comments below. On Tue, Mar 07, 2017 at 06:09:07PM +0000, Robin Murphy wrote: > TLB synchronisation typically involves the SMMU blocking all incoming > transactions until the TLBs report completion of all outstanding > operations. In the common SMMUv2 configuration of a single distributed > SMMU serving multiple peripherals, that means that a single unmap > request has the potential to bring the hammer down on the entire system > if synchronised globally. Since stage 1 contexts, and stage 2 contexts > under SMMUv2, offer local sync operations, let's make use of those > wherever we can in the hope of minimising global disruption. > > To that end, rather than add any more branches to the already unwieldy > monolithic TLB maintenance ops, break them up into smaller, neater, > functions which we can then mix and match as appropriate. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- > 1 file changed, 100 insertions(+), 56 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index c8aafe304171..f7411109670f 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { > #define ARM_SMMU_CB_S1_TLBIVAL 0x620 > #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 > #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 > +#define ARM_SMMU_CB_TLBSYNC 0x7f0 > +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 > #define ARM_SMMU_CB_ATS1PR 0x800 > #define ARM_SMMU_CB_ATSR 0x8f0 > > @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) > } > > /* Wait for any pending TLB invalidations to complete */ > -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) > +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, > + void __iomem *sync, void __iomem *status) Given that you take the arm_smmu_device anyway, I think I'd prefer just passing the offsets for sync and status and avoiding the additions in the caller (a bit like your other patch in this series ;). > static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, > @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, > { > struct arm_smmu_domain *smmu_domain = cookie; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - struct arm_smmu_device *smmu = smmu_domain->smmu; > bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; > - void __iomem *reg; > + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); > + size_t step; > > - if (stage1) { > - reg = ARM_SMMU_CB(smmu, cfg->cbndx); > - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; > - > - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { > - iova &= ~12UL; > - iova |= cfg->asid; > - do { > - writel_relaxed(iova, reg); > - iova += granule; > - } while (size -= granule); > - } else { > - iova >>= 12; > - iova |= (u64)cfg->asid << 48; > - do { > - writeq_relaxed(iova, reg); > - iova += granule >> 12; > - } while (size -= granule); > - } > - } else if (smmu->version == ARM_SMMU_V2) { > - reg = ARM_SMMU_CB(smmu, cfg->cbndx); > + if (stage1) > + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : > + ARM_SMMU_CB_S1_TLBIVA; > + else > reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : > ARM_SMMU_CB_S2_TLBIIPAS2; > - iova >>= 12; > - do { > - smmu_write_atomic_lq(iova, reg); > - iova += granule >> 12; > - } while (size -= granule); > + > + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { > + iova &= ~12UL; > + iova |= cfg->asid; > + step = granule; > } else { > - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; > - writel_relaxed(cfg->vmid, reg); > + iova >>= 12; > + step = granule >> 12; > + if (stage1) > + iova |= (u64)cfg->asid << 48; > } > + > + do { > + smmu_write_atomic_lq(iova, reg); > + iova += step; > + } while (size -= granule); There seems to be a lot of refactoring going on here, but I'm not entirely comfortable with the unconditional move to smmu_write_atomic_lq. Given the way in which arm_smmu_tlb_inv_range_nosync is now called (i.e. only for stage-1 or SMMUv2 stage-2), then I think you just need to delete the final else clause in the current code and you're done. Will ^ permalink raw reply [flat|nested] 32+ messages in thread
[parent not found: <20170330143744.GG22160-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate 2017-03-30 14:37 ` Will Deacon @ 2017-03-30 15:48 ` Robin Murphy -1 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-03-30 15:48 UTC (permalink / raw) To: Will Deacon Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r On 30/03/17 15:37, Will Deacon wrote: > Hi Robin, > > This mostly looks great, but I have a couple of minor comments below. > > On Tue, Mar 07, 2017 at 06:09:07PM +0000, Robin Murphy wrote: >> TLB synchronisation typically involves the SMMU blocking all incoming >> transactions until the TLBs report completion of all outstanding >> operations. In the common SMMUv2 configuration of a single distributed >> SMMU serving multiple peripherals, that means that a single unmap >> request has the potential to bring the hammer down on the entire system >> if synchronised globally. Since stage 1 contexts, and stage 2 contexts >> under SMMUv2, offer local sync operations, let's make use of those >> wherever we can in the hope of minimising global disruption. >> >> To that end, rather than add any more branches to the already unwieldy >> monolithic TLB maintenance ops, break them up into smaller, neater, >> functions which we can then mix and match as appropriate. >> >> Signed-off-by: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- >> 1 file changed, 100 insertions(+), 56 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index c8aafe304171..f7411109670f 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { >> #define ARM_SMMU_CB_S1_TLBIVAL 0x620 >> #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 >> #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 >> +#define ARM_SMMU_CB_TLBSYNC 0x7f0 >> +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 >> #define ARM_SMMU_CB_ATS1PR 0x800 >> #define ARM_SMMU_CB_ATSR 0x8f0 >> >> @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) >> } >> >> /* Wait for any pending TLB invalidations to complete */ >> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, >> + void __iomem *sync, void __iomem *status) > > Given that you take the arm_smmu_device anyway, I think I'd prefer just > passing the offsets for sync and status and avoiding the additions > in the caller (a bit like your other patch in this series ;). Note that the sole reason for passing the arm_smmu_device is for the dev_err(), but I neither want to remove that nor duplicate it across the callers... However, the concrete reason for not passing offsets is that this function serves for both global and local syncs, so there is no single base address that can be assumed. At one point I toyed with just passing a context bank number (using -1 for "global") but even I thought that ended up looking awful ;) >> static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, >> @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, >> { >> struct arm_smmu_domain *smmu_domain = cookie; >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> - struct arm_smmu_device *smmu = smmu_domain->smmu; >> bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; >> - void __iomem *reg; >> + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); >> + size_t step; >> >> - if (stage1) { >> - reg = ARM_SMMU_CB(smmu, cfg->cbndx); >> - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; >> - >> - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { >> - iova &= ~12UL; >> - iova |= cfg->asid; >> - do { >> - writel_relaxed(iova, reg); >> - iova += granule; >> - } while (size -= granule); >> - } else { >> - iova >>= 12; >> - iova |= (u64)cfg->asid << 48; >> - do { >> - writeq_relaxed(iova, reg); >> - iova += granule >> 12; >> - } while (size -= granule); >> - } >> - } else if (smmu->version == ARM_SMMU_V2) { >> - reg = ARM_SMMU_CB(smmu, cfg->cbndx); >> + if (stage1) >> + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : >> + ARM_SMMU_CB_S1_TLBIVA; >> + else >> reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : >> ARM_SMMU_CB_S2_TLBIIPAS2; >> - iova >>= 12; >> - do { >> - smmu_write_atomic_lq(iova, reg); >> - iova += granule >> 12; >> - } while (size -= granule); >> + >> + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { >> + iova &= ~12UL; >> + iova |= cfg->asid; >> + step = granule; >> } else { >> - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; >> - writel_relaxed(cfg->vmid, reg); >> + iova >>= 12; >> + step = granule >> 12; >> + if (stage1) >> + iova |= (u64)cfg->asid << 48; >> } >> + >> + do { >> + smmu_write_atomic_lq(iova, reg); >> + iova += step; >> + } while (size -= granule); > > There seems to be a lot of refactoring going on here, but I'm not entirely > comfortable with the unconditional move to smmu_write_atomic_lq. Given the > way in which arm_smmu_tlb_inv_range_nosync is now called (i.e. only for > stage-1 or SMMUv2 stage-2), then I think you just need to delete the final > else clause in the current code and you're done. Yes, this bears the scars of having been split into multiple functions then put back together again, possibly more than once (I forget the full history). Now that things have settled on just 3 sets of ops (stage 1, stage 2, and braindead stage 2), I'll have one last try at a stage 1/stage 2 split, just to try and make the logic less inscrutable (which is my main gripe with this function), but I'll do so as a standalone patch that you can have the final call on. Robin. > > Will > ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate @ 2017-03-30 15:48 ` Robin Murphy 0 siblings, 0 replies; 32+ messages in thread From: Robin Murphy @ 2017-03-30 15:48 UTC (permalink / raw) To: linux-arm-kernel On 30/03/17 15:37, Will Deacon wrote: > Hi Robin, > > This mostly looks great, but I have a couple of minor comments below. > > On Tue, Mar 07, 2017 at 06:09:07PM +0000, Robin Murphy wrote: >> TLB synchronisation typically involves the SMMU blocking all incoming >> transactions until the TLBs report completion of all outstanding >> operations. In the common SMMUv2 configuration of a single distributed >> SMMU serving multiple peripherals, that means that a single unmap >> request has the potential to bring the hammer down on the entire system >> if synchronised globally. Since stage 1 contexts, and stage 2 contexts >> under SMMUv2, offer local sync operations, let's make use of those >> wherever we can in the hope of minimising global disruption. >> >> To that end, rather than add any more branches to the already unwieldy >> monolithic TLB maintenance ops, break them up into smaller, neater, >> functions which we can then mix and match as appropriate. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> drivers/iommu/arm-smmu.c | 156 ++++++++++++++++++++++++++++++----------------- >> 1 file changed, 100 insertions(+), 56 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index c8aafe304171..f7411109670f 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -237,6 +237,8 @@ enum arm_smmu_s2cr_privcfg { >> #define ARM_SMMU_CB_S1_TLBIVAL 0x620 >> #define ARM_SMMU_CB_S2_TLBIIPAS2 0x630 >> #define ARM_SMMU_CB_S2_TLBIIPAS2L 0x638 >> +#define ARM_SMMU_CB_TLBSYNC 0x7f0 >> +#define ARM_SMMU_CB_TLBSTATUS 0x7f4 >> #define ARM_SMMU_CB_ATS1PR 0x800 >> #define ARM_SMMU_CB_ATSR 0x8f0 >> >> @@ -569,14 +571,13 @@ static void __arm_smmu_free_bitmap(unsigned long *map, int idx) >> } >> >> /* Wait for any pending TLB invalidations to complete */ >> -static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu) >> +static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, >> + void __iomem *sync, void __iomem *status) > > Given that you take the arm_smmu_device anyway, I think I'd prefer just > passing the offsets for sync and status and avoiding the additions > in the caller (a bit like your other patch in this series ;). Note that the sole reason for passing the arm_smmu_device is for the dev_err(), but I neither want to remove that nor duplicate it across the callers... However, the concrete reason for not passing offsets is that this function serves for both global and local syncs, so there is no single base address that can be assumed. At one point I toyed with just passing a context bank number (using -1 for "global") but even I thought that ended up looking awful ;) >> static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, >> @@ -617,48 +638,66 @@ static void arm_smmu_tlb_inv_range_nosync(unsigned long iova, size_t size, >> { >> struct arm_smmu_domain *smmu_domain = cookie; >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> - struct arm_smmu_device *smmu = smmu_domain->smmu; >> bool stage1 = cfg->cbar != CBAR_TYPE_S2_TRANS; >> - void __iomem *reg; >> + void __iomem *reg = ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx); >> + size_t step; >> >> - if (stage1) { >> - reg = ARM_SMMU_CB(smmu, cfg->cbndx); >> - reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : ARM_SMMU_CB_S1_TLBIVA; >> - >> - if (cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { >> - iova &= ~12UL; >> - iova |= cfg->asid; >> - do { >> - writel_relaxed(iova, reg); >> - iova += granule; >> - } while (size -= granule); >> - } else { >> - iova >>= 12; >> - iova |= (u64)cfg->asid << 48; >> - do { >> - writeq_relaxed(iova, reg); >> - iova += granule >> 12; >> - } while (size -= granule); >> - } >> - } else if (smmu->version == ARM_SMMU_V2) { >> - reg = ARM_SMMU_CB(smmu, cfg->cbndx); >> + if (stage1) >> + reg += leaf ? ARM_SMMU_CB_S1_TLBIVAL : >> + ARM_SMMU_CB_S1_TLBIVA; >> + else >> reg += leaf ? ARM_SMMU_CB_S2_TLBIIPAS2L : >> ARM_SMMU_CB_S2_TLBIIPAS2; >> - iova >>= 12; >> - do { >> - smmu_write_atomic_lq(iova, reg); >> - iova += granule >> 12; >> - } while (size -= granule); >> + >> + if (stage1 && cfg->fmt != ARM_SMMU_CTX_FMT_AARCH64) { >> + iova &= ~12UL; >> + iova |= cfg->asid; >> + step = granule; >> } else { >> - reg = ARM_SMMU_GR0(smmu) + ARM_SMMU_GR0_TLBIVMID; >> - writel_relaxed(cfg->vmid, reg); >> + iova >>= 12; >> + step = granule >> 12; >> + if (stage1) >> + iova |= (u64)cfg->asid << 48; >> } >> + >> + do { >> + smmu_write_atomic_lq(iova, reg); >> + iova += step; >> + } while (size -= granule); > > There seems to be a lot of refactoring going on here, but I'm not entirely > comfortable with the unconditional move to smmu_write_atomic_lq. Given the > way in which arm_smmu_tlb_inv_range_nosync is now called (i.e. only for > stage-1 or SMMUv2 stage-2), then I think you just need to delete the final > else clause in the current code and you're done. Yes, this bears the scars of having been split into multiple functions then put back together again, possibly more than once (I forget the full history). Now that things have settled on just 3 sets of ops (stage 1, stage 2, and braindead stage 2), I'll have one last try at a stage 1/stage 2 split, just to try and make the logic less inscrutable (which is my main gripe with this function), but I'll do so as a standalone patch that you can have the final call on. Robin. > > Will > ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2017-03-30 15:48 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-01-26 18:06 [PATCH 0/4] Miscellaneous ARM SMMU patches Robin Murphy 2016-01-26 18:06 ` Robin Murphy [not found] ` <cover.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2016-01-26 18:06 ` [PATCH 1/4] iommu/arm-smmu: Treat all device transactions as unprivileged Robin Murphy 2016-01-26 18:06 ` Robin Murphy [not found] ` <6c5730256333b8d941f2c0371c1ab709a454938c.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2016-01-27 6:00 ` Anup Patel 2016-01-27 6:00 ` Anup Patel 2016-02-09 14:08 ` Will Deacon 2016-02-09 14:08 ` Will Deacon 2016-01-26 18:06 ` [PATCH 2/4] iommu/arm-smmu: Allow disabling unmatched stream bypass Robin Murphy 2016-01-26 18:06 ` Robin Murphy [not found] ` <10ebf5a136a787da54ffd1d6167953f82f01a834.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2016-02-09 14:06 ` Will Deacon 2016-02-09 14:06 ` Will Deacon [not found] ` <20160209140631.GM22874-5wv7dgnIgG8@public.gmane.org> 2016-02-10 12:10 ` Robin Murphy 2016-02-10 12:10 ` Robin Murphy 2016-02-10 14:25 ` [PATCH v2] " Robin Murphy 2016-02-10 14:25 ` Robin Murphy 2016-01-26 18:06 ` [PATCH 3/4] iommu/arm-smmu: Support DMA-API domains Robin Murphy 2016-01-26 18:06 ` Robin Murphy 2016-01-26 18:06 ` [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate Robin Murphy 2016-01-26 18:06 ` Robin Murphy [not found] ` <fc2ede950aea2e84a0b7cf93e4cd605f0cbdf3c3.1453830752.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2016-02-09 14:15 ` Will Deacon 2016-02-09 14:15 ` Will Deacon [not found] ` <20160209141508.GO22874-5wv7dgnIgG8@public.gmane.org> 2016-02-10 11:58 ` Robin Murphy 2016-02-10 11:58 ` Robin Murphy 2016-02-09 14:16 ` [PATCH 0/4] Miscellaneous ARM SMMU patches Will Deacon 2016-02-09 14:16 ` Will Deacon 2017-03-07 18:09 [PATCH 0/4] ARM SMMU per-context TLB sync Robin Murphy [not found] ` <cover.1488907474.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2017-03-07 18:09 ` [PATCH 4/4] iommu/arm-smmu: Use per-context TLB sync as appropriate Robin Murphy 2017-03-07 18:09 ` Robin Murphy [not found] ` <03ef837586cd991f2d8431908230fd3a3dd8c9cf.1488907474.git.robin.murphy-5wv7dgnIgG8@public.gmane.org> 2017-03-30 14:37 ` Will Deacon 2017-03-30 14:37 ` Will Deacon [not found] ` <20170330143744.GG22160-5wv7dgnIgG8@public.gmane.org> 2017-03-30 15:48 ` Robin Murphy 2017-03-30 15:48 ` Robin Murphy
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.