* [PATCH v7 0/6] iommu/arm-smmu: Add runtime pm/sleep support @ 2018-02-07 10:31 Vivek Gautam [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA This series provides the support for turning on the arm-smmu's clocks/power domains using runtime pm. This is done using the recently introduced device links patches, which lets the smmu's runtime to follow the master's runtime pm, so the smmu remains powered only when the masters use it. It also adds support for Qcom's arm-smmu-v2 variant that has different clocks and power requirements. Took some reference from the exynos runtime patches [1]. After much discussion [3] over the use of pm_runtime_get/put() in .unmap op path for the arm-smmu, and after disussing over more than a couple of approaches to address this, we are putting forward the changes *without* using pm_runtime APIs in 'unmap'. Rather, letting the client device take the control of powering on/off the connected iommu through pm_runtime_get(put)_suppliers() APIs for the scnerios when the iommu power can't be directly controlled by clients through device links. Rafael acked the change to export the suppliers APIs. Previous version of this patch series is @ [5]. [v7] * Addressed review comments given by Robin Murphy - -- Added device_link_del() in .remove_device path. -- Error path cleanup in arm_smmu_add_device(). -- Added pm_runtime_get/put_sync() in .remove path, and replaced pm_runtime_force_suspend() with pm_runtime_disable(). -- clk_names cleanup in arm_smmu_init_clks() * Added 'Reviewed-by' given by Rob H. [V6] * Added Ack given by Rafael to first patch in the series. * Addressed Rob Herring's comment for adding soc specific compatible string as well besides 'qcom,smmu-v2'. [V5] * Dropped runtime pm calls from "arm_smmu_unmap" op as discussed over the list [3] for the last patch series. * Added a patch to export pm_runtime_get/put_suppliers() APIs to the series as agreed with Rafael [4]. * Added the related patch for msm drm iommu layer to use pm_runtime_get/put_suppliers() APIs in msm_mmu_funcs. * Dropped arm-mmu500 clock patch since that would break existing platforms. * Changed compatible 'qcom,msm8996-smmu-v2' to 'qcom,smmu-v2' to reflect the IP version rather than the platform on which it is used. The same IP is used across multiple platforms including msm8996, and sdm845 etc. * Using clock bulk APIs to handle the clocks available to the IP as suggested by Stephen Boyd. * The first patch in v4 version of the patch-series: ("iommu/arm-smmu: Fix the error path in arm_smmu_add_device") has already made it to mainline. [V4] * Reworked the clock handling part. We now take clock names as data in the driver for supported compatible versions, and loop over them to get, enable, and disable the clocks. * Using qcom,msm8996 based compatibles for bindings instead of a generic qcom compatible. * Refactor MMU500 patch to just add the necessary clock names data and corresponding bindings. * Added the pm_runtime_get/put() calls in .unmap iommu op (fix added by Stanimir on top of previous patch version. * Added a patch to fix error path in arm_smmu_add_device() * Removed patch 3/5 of V3 patch series that added qcom,smmu-v2 bindings. [V3] * Reworked the patches to keep the clocks init/enabling function separately for each compatible. * Added clocks bindings for MMU40x/500. * Added a new compatible for qcom,smmu-v2 implementation and the clock bindings for the same. * Rebased on top of 4.11-rc1 [V2] * Split the patches little differently. * Addressed comments. * Removed the patch #4 [2] from previous post for arm-smmu context save restore. Planning to post this separately after reworking/addressing Robin's feedback. * Reversed the sequence to disable clocks than enabling. This was required for those cases where the clocks are populated in a dependent order from DT. [1] https://lkml.org/lkml/2016/10/20/70 [2] https://patchwork.kernel.org/patch/9389717/ [3] https://patchwork.kernel.org/patch/9827825/ [4] https://patchwork.kernel.org/patch/10102445/ [5] https://lkml.org/lkml/2018/1/19/217 Sricharan R (3): iommu/arm-smmu: Add pm_runtime/sleep ops iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device iommu/arm-smmu: Add the device_link between masters and smmu Vivek Gautam (3): base: power: runtime: Export pm_runtime_get/put_suppliers iommu/arm-smmu: Add support for qcom,smmu-v2 variant drm/msm: iommu: Replace runtime calls with runtime suppliers .../devicetree/bindings/iommu/arm,smmu.txt | 43 +++++++ drivers/base/power/runtime.c | 2 + drivers/gpu/drm/msm/msm_iommu.c | 16 +-- drivers/iommu/arm-smmu.c | 127 ++++++++++++++++++++- 4 files changed, 174 insertions(+), 14 deletions(-) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>]
* [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-13 7:44 ` Tomasz Figa 2018-02-07 10:31 ` [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops Vivek Gautam ` (4 subsequent siblings) 5 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: architt-sgV2jX0FEOL9JmXXK+q4OQ, airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, sricharan-sgV2jX0FEOL9JmXXK+q4OQ, m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ The device link allows the pm framework to tie the supplier and consumer. So, whenever the consumer is powered-on the supplier is powered-on first. There are however cases in which the consumer wants to power-on the supplier, but not itself. E.g., A Graphics or multimedia driver wants to power-on the SMMU to unmap a buffer and finish the TLB operations without powering on itself. Some of these unmap requests are coming from the user space when the controller itself is not powered-up, and it can be huge penalty in terms of power and latency to power-up the graphics/mm controllers. There can be an argument that the supplier should handle this case on its own and there should not be a need for the consumer to power-on the supplier. But as discussed on the thread [1] about ARM-SMMU runtime pm, we don't want to introduce runtime pm calls in atomic path in arm_smmu_unmap. [1] https://patchwork.kernel.org/patch/9827825/ Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> --- drivers/base/power/runtime.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c index 8bef3cb2424d..5b8226c8af19 100644 --- a/drivers/base/power/runtime.c +++ b/drivers/base/power/runtime.c @@ -1579,6 +1579,7 @@ void pm_runtime_get_suppliers(struct device *dev) device_links_read_unlock(idx); } +EXPORT_SYMBOL_GPL(pm_runtime_get_suppliers); /** * pm_runtime_put_suppliers - Drop references to supplier devices. @@ -1597,6 +1598,7 @@ void pm_runtime_put_suppliers(struct device *dev) device_links_read_unlock(idx); } +EXPORT_SYMBOL_GPL(pm_runtime_put_suppliers); void pm_runtime_new_link(struct device *dev) { -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers 2018-02-07 10:31 ` [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers Vivek Gautam @ 2018-02-13 7:44 ` Tomasz Figa [not found] ` <CAAFQd5BmroRf-C8dQkvTKHWK1psGnNi1t7g-q=Xce6KjrGTsdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 7:44 UTC (permalink / raw) To: Vivek Gautam Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark Hi Vivek, On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > The device link allows the pm framework to tie the supplier and > consumer. So, whenever the consumer is powered-on the supplier > is powered-on first. > > There are however cases in which the consumer wants to power-on > the supplier, but not itself. > E.g., A Graphics or multimedia driver wants to power-on the SMMU > to unmap a buffer and finish the TLB operations without powering > on itself. This sounds strange to me. If the SMMU is powered down, wouldn't the TLB lose its contents as well (and so no flushing needed)? Other than that, what kind of hardware operations would be needed besides just updating the page tables from the CPU? Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5BmroRf-C8dQkvTKHWK1psGnNi1t7g-q=Xce6KjrGTsdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers [not found] ` <CAAFQd5BmroRf-C8dQkvTKHWK1psGnNi1t7g-q=Xce6KjrGTsdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-13 12:00 ` Robin Murphy 2018-02-13 12:54 ` Tomasz Figa 0 siblings, 1 reply; 53+ messages in thread From: Robin Murphy @ 2018-02-13 12:00 UTC (permalink / raw) To: Tomasz Figa, Vivek Gautam Cc: list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, Joerg Roedel On 13/02/18 07:44, Tomasz Figa wrote: > Hi Vivek, > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >> The device link allows the pm framework to tie the supplier and >> consumer. So, whenever the consumer is powered-on the supplier >> is powered-on first. >> >> There are however cases in which the consumer wants to power-on >> the supplier, but not itself. >> E.g., A Graphics or multimedia driver wants to power-on the SMMU >> to unmap a buffer and finish the TLB operations without powering >> on itself. > > This sounds strange to me. If the SMMU is powered down, wouldn't the > TLB lose its contents as well (and so no flushing needed)? Depends on implementation details - if runtime PM is actually implemented via external clock gating (in the absence of fine-grained power domains), then "suspended" TLBs might both retain state and not receive invalidation requests, which is really the worst case. > Other than that, what kind of hardware operations would be needed > besides just updating the page tables from the CPU? Domain attach/detach also require updating SMMU hardware state (and possibly TLB maintenance), but don't logically require the master device itself to be active at the time. Robin. -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers 2018-02-13 12:00 ` Robin Murphy @ 2018-02-13 12:54 ` Tomasz Figa 2018-02-13 13:37 ` Robin Murphy 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 12:54 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, devicetree, linux-pm, David Airlie, Will Deacon, Rafael J. Wysocki, dri-devel, linux-kernel, list@263.net:IOMMU DRIVERS, Rob Herring, Vivek Gautam, Greg KH, freedreno, sboyd, linux-arm-msm On Tue, Feb 13, 2018 at 9:00 PM, Robin Murphy <robin.murphy@arm.com> wrote: > On 13/02/18 07:44, Tomasz Figa wrote: >> >> Hi Vivek, >> >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >> <vivek.gautam@codeaurora.org> wrote: >>> >>> The device link allows the pm framework to tie the supplier and >>> consumer. So, whenever the consumer is powered-on the supplier >>> is powered-on first. >>> >>> There are however cases in which the consumer wants to power-on >>> the supplier, but not itself. >>> E.g., A Graphics or multimedia driver wants to power-on the SMMU >>> to unmap a buffer and finish the TLB operations without powering >>> on itself. >> >> >> This sounds strange to me. If the SMMU is powered down, wouldn't the >> TLB lose its contents as well (and so no flushing needed)? > > > Depends on implementation details - if runtime PM is actually implemented > via external clock gating (in the absence of fine-grained power domains), > then "suspended" TLBs might both retain state and not receive invalidation > requests, which is really the worst case. Agreed. That's why in "[PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device" I actually suggested managing clocks separately from runtime PM. At least until runtime PM framework arrives at a state, where multiple power states can be managed, i.e. full power state, clock-gated state, domain-off state. (I think I might have seen some ongoing work on this on LWN though...) > >> Other than that, what kind of hardware operations would be needed >> besides just updating the page tables from the CPU? > > > Domain attach/detach also require updating SMMU hardware state (and possibly > TLB maintenance), but don't logically require the master device itself to be > active at the time. Wouldn't this hardware state need to be reinitialized anyway after respective power domain power cycles? (In other words, hardware would only need programming if it's powered on at the moment.) Best regards, Tomasz _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers 2018-02-13 12:54 ` Tomasz Figa @ 2018-02-13 13:37 ` Robin Murphy 0 siblings, 0 replies; 53+ messages in thread From: Robin Murphy @ 2018-02-13 13:37 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree, linux-pm, David Airlie, Will Deacon, Rafael J. Wysocki, dri-devel, linux-kernel, list@263.net:IOMMU DRIVERS, Rob Herring, Vivek Gautam, Greg KH, freedreno, sboyd, linux-arm-msm On 13/02/18 12:54, Tomasz Figa wrote: > On Tue, Feb 13, 2018 at 9:00 PM, Robin Murphy <robin.murphy@arm.com> wrote: >> On 13/02/18 07:44, Tomasz Figa wrote: >>> >>> Hi Vivek, >>> >>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>> <vivek.gautam@codeaurora.org> wrote: >>>> >>>> The device link allows the pm framework to tie the supplier and >>>> consumer. So, whenever the consumer is powered-on the supplier >>>> is powered-on first. >>>> >>>> There are however cases in which the consumer wants to power-on >>>> the supplier, but not itself. >>>> E.g., A Graphics or multimedia driver wants to power-on the SMMU >>>> to unmap a buffer and finish the TLB operations without powering >>>> on itself. >>> >>> >>> This sounds strange to me. If the SMMU is powered down, wouldn't the >>> TLB lose its contents as well (and so no flushing needed)? >> >> >> Depends on implementation details - if runtime PM is actually implemented >> via external clock gating (in the absence of fine-grained power domains), >> then "suspended" TLBs might both retain state and not receive invalidation >> requests, which is really the worst case. > > Agreed. That's why in "[PATCH v7 3/6] iommu/arm-smmu: Invoke > pm_runtime during probe, add/remove device" I actually suggested > managing clocks separately from runtime PM. At least until runtime PM > framework arrives at a state, where multiple power states can be > managed, i.e. full power state, clock-gated state, domain-off state. > (I think I might have seen some ongoing work on this on LWN though...) > >> >>> Other than that, what kind of hardware operations would be needed >>> besides just updating the page tables from the CPU? >> >> >> Domain attach/detach also require updating SMMU hardware state (and possibly >> TLB maintenance), but don't logically require the master device itself to be >> active at the time. > > Wouldn't this hardware state need to be reinitialized anyway after > respective power domain power cycles? (In other words, hardware would > only need programming if it's powered on at the moment.) Yes, if the entire SMMU was fully powered down because all masters were inactive, then all that should need to be done is to update the software shadow state in the expectation that arm_smmu_reset() would re-sync it upon TCU powerup. If at least some part of the internal logic remains active, though, then you may or may not need to fiddle with zero or more clocks and/or power domains (depending on microarchitecture and integration) in order to be sure that everything from the programming slave interface through to wherever that state is kept works correctly so that it can be changed. The main motivation here is that the Qualcomm SMMU microarchitecture apparently allows the programming interface to be shut down separately from the TCU core (context banks, page table walker, etc.), and they get an appreciable power saving from doing so. This is different from, say, the Arm Ltd. implementations, where the entire TCU is a single clock/power domain internally (although you could maybe still gate the external APB interface clock). As the previous discussions have shown, this is really, really hard to do properly in a generic manner. Robin. _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 2018-02-07 10:31 ` [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers Vivek Gautam @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-13 8:03 ` Tomasz Figa 2018-02-07 10:31 ` [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device Vivek Gautam ` (3 subsequent siblings) 5 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> The smmu needs to be functional only when the respective master's using it are active. The device_link feature helps to track such functional dependencies, so that the iommu gets powered when the master device enables itself using pm_runtime. So by adapting the smmu driver for runtime pm, above said dependency can be addressed. This patch adds the pm runtime/sleep callbacks to the driver and also the functions to parse the smmu clocks from DT and enable them in resume/suspend. Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> Signed-off-by: Archit Taneja <architt-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> [vivek: Clock rework to request bulk of clocks] Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> --- drivers/iommu/arm-smmu.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 54 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 69e7c60792a8..9e2f917e16c2 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -48,6 +48,7 @@ #include <linux/of_iommu.h> #include <linux/pci.h> #include <linux/platform_device.h> +#include <linux/pm_runtime.h> #include <linux/slab.h> #include <linux/spinlock.h> @@ -205,6 +206,8 @@ struct arm_smmu_device { u32 num_global_irqs; u32 num_context_irqs; unsigned int *irqs; + struct clk_bulk_data *clocks; + int num_clks; u32 cavium_id_base; /* Specific to Cavium */ @@ -1897,10 +1900,12 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu) struct arm_smmu_match_data { enum arm_smmu_arch_version version; enum arm_smmu_implementation model; + const char * const *clks; + int num_clks; }; #define ARM_SMMU_MATCH_DATA(name, ver, imp) \ -static struct arm_smmu_match_data name = { .version = ver, .model = imp } +static const struct arm_smmu_match_data name = { .version = ver, .model = imp } ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU); ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU); @@ -2001,6 +2006,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev, data = of_device_get_match_data(dev); smmu->version = data->version; smmu->model = data->model; + smmu->num_clks = data->num_clks; parse_driver_options(smmu); @@ -2039,6 +2045,28 @@ static void arm_smmu_bus_init(void) #endif } +static int arm_smmu_init_clks(struct arm_smmu_device *smmu) +{ + int i; + int num = smmu->num_clks; + const struct arm_smmu_match_data *data; + + if (num < 1) + return 0; + + smmu->clocks = devm_kcalloc(smmu->dev, num, + sizeof(*smmu->clocks), GFP_KERNEL); + if (!smmu->clocks) + return -ENOMEM; + + data = of_device_get_match_data(smmu->dev); + + for (i = 0; i < num; i++) + smmu->clocks[i].id = data->clks[i]; + + return devm_clk_bulk_get(smmu->dev, num, smmu->clocks); +} + static int arm_smmu_device_probe(struct platform_device *pdev) { struct resource *res; @@ -2099,6 +2127,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev) smmu->irqs[i] = irq; } + err = arm_smmu_init_clks(smmu); + if (err) + return err; + err = arm_smmu_device_cfg_probe(smmu); if (err) return err; @@ -2197,7 +2229,27 @@ static int __maybe_unused arm_smmu_pm_resume(struct device *dev) return 0; } -static SIMPLE_DEV_PM_OPS(arm_smmu_pm_ops, NULL, arm_smmu_pm_resume); +static int __maybe_unused arm_smmu_runtime_resume(struct device *dev) +{ + struct arm_smmu_device *smmu = dev_get_drvdata(dev); + + return clk_bulk_prepare_enable(smmu->num_clks, smmu->clocks); +} + +static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev) +{ + struct arm_smmu_device *smmu = dev_get_drvdata(dev); + + clk_bulk_disable_unprepare(smmu->num_clks, smmu->clocks); + + return 0; +} + +static const struct dev_pm_ops arm_smmu_pm_ops = { + SET_SYSTEM_SLEEP_PM_OPS(NULL, arm_smmu_pm_resume) + SET_RUNTIME_PM_OPS(arm_smmu_runtime_suspend, + arm_smmu_runtime_resume, NULL) +}; static struct platform_driver arm_smmu_driver = { .driver = { -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops 2018-02-07 10:31 ` [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops Vivek Gautam @ 2018-02-13 8:03 ` Tomasz Figa [not found] ` <CAAFQd5B2u8RL-tdB3qgPxVUcXnsBSEhXRBZWxqO-w6rYKAiOtg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 8:03 UTC (permalink / raw) To: Vivek Gautam Cc: Mark Rutland, devicetree, linux-pm, David Airlie, Will Deacon, Rafael J. Wysocki, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, , dri-devel, linux-kernel, Rob Herring, Greg KH, freedreno, Robin Murphy, sboyd, linux-arm-msm Hi Vivek, Thanks for the patch. Please see some comments inline. On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > From: Sricharan R <sricharan@codeaurora.org> > > The smmu needs to be functional only when the respective > master's using it are active. The device_link feature > helps to track such functional dependencies, so that the > iommu gets powered when the master device enables itself > using pm_runtime. So by adapting the smmu driver for > runtime pm, above said dependency can be addressed. > > This patch adds the pm runtime/sleep callbacks to the > driver and also the functions to parse the smmu clocks > from DT and enable them in resume/suspend. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > Signed-off-by: Archit Taneja <architt@codeaurora.org> > [vivek: Clock rework to request bulk of clocks] > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 54 insertions(+), 2 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 69e7c60792a8..9e2f917e16c2 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -48,6 +48,7 @@ > #include <linux/of_iommu.h> > #include <linux/pci.h> > #include <linux/platform_device.h> > +#include <linux/pm_runtime.h> > #include <linux/slab.h> > #include <linux/spinlock.h> > > @@ -205,6 +206,8 @@ struct arm_smmu_device { > u32 num_global_irqs; > u32 num_context_irqs; > unsigned int *irqs; > + struct clk_bulk_data *clocks; > + int num_clks; nit: Perhaps "num_clocks" to be consistent with "clocks"? > > u32 cavium_id_base; /* Specific to Cavium */ > > @@ -1897,10 +1900,12 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu) > struct arm_smmu_match_data { > enum arm_smmu_arch_version version; > enum arm_smmu_implementation model; > + const char * const *clks; > + int num_clks; nit: Perhaps s/clks/clocks/ here or s/clocks/clks/ in struct arm_smmu_device? > }; > > #define ARM_SMMU_MATCH_DATA(name, ver, imp) \ > -static struct arm_smmu_match_data name = { .version = ver, .model = imp } > +static const struct arm_smmu_match_data name = { .version = ver, .model = imp } > > ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU); > ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU); > @@ -2001,6 +2006,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev, > data = of_device_get_match_data(dev); > smmu->version = data->version; > smmu->model = data->model; > + smmu->num_clks = data->num_clks; > > parse_driver_options(smmu); > > @@ -2039,6 +2045,28 @@ static void arm_smmu_bus_init(void) > #endif > } > > +static int arm_smmu_init_clks(struct arm_smmu_device *smmu) > +{ > + int i; > + int num = smmu->num_clks; > + const struct arm_smmu_match_data *data; > + > + if (num < 1) > + return 0; > + > + smmu->clocks = devm_kcalloc(smmu->dev, num, > + sizeof(*smmu->clocks), GFP_KERNEL); > + if (!smmu->clocks) > + return -ENOMEM; > + > + data = of_device_get_match_data(smmu->dev); > + > + for (i = 0; i < num; i++) > + smmu->clocks[i].id = data->clks[i]; I'd argue that arm_smmu_device_dt_probe() is a better place for all the code above, since this function is called regardless of whether the device is probed from DT or not. Going further, arm_smmu_device_acpi_probe() could fill smmu->num_clks and ->clocks using ACPI-like way (as opposed to OF match data) if necessary. Best regards, Tomasz _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5B2u8RL-tdB3qgPxVUcXnsBSEhXRBZWxqO-w6rYKAiOtg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops [not found] ` <CAAFQd5B2u8RL-tdB3qgPxVUcXnsBSEhXRBZWxqO-w6rYKAiOtg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-13 10:25 ` Vivek Gautam 2018-02-14 3:45 ` Tomasz Figa 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-13 10:25 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , dri-devel, open list, Rob Herring, Greg KH, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Stephen Boyd, linux-arm-msm Hi Tomasz, Please find my response inline below. On Tue, Feb 13, 2018 at 1:33 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > Hi Vivek, > > Thanks for the patch. Please see some comments inline. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> >> The smmu needs to be functional only when the respective >> master's using it are active. The device_link feature >> helps to track such functional dependencies, so that the >> iommu gets powered when the master device enables itself >> using pm_runtime. So by adapting the smmu driver for >> runtime pm, above said dependency can be addressed. >> >> This patch adds the pm runtime/sleep callbacks to the >> driver and also the functions to parse the smmu clocks >> from DT and enable them in resume/suspend. >> >> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> Signed-off-by: Archit Taneja <architt-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> [vivek: Clock rework to request bulk of clocks] >> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++-- >> 1 file changed, 54 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 69e7c60792a8..9e2f917e16c2 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -48,6 +48,7 @@ >> #include <linux/of_iommu.h> >> #include <linux/pci.h> >> #include <linux/platform_device.h> >> +#include <linux/pm_runtime.h> >> #include <linux/slab.h> >> #include <linux/spinlock.h> >> >> @@ -205,6 +206,8 @@ struct arm_smmu_device { >> u32 num_global_irqs; >> u32 num_context_irqs; >> unsigned int *irqs; >> + struct clk_bulk_data *clocks; >> + int num_clks; > > nit: Perhaps "num_clocks" to be consistent with "clocks"? > >> >> u32 cavium_id_base; /* Specific to Cavium */ >> >> @@ -1897,10 +1900,12 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu) >> struct arm_smmu_match_data { >> enum arm_smmu_arch_version version; >> enum arm_smmu_implementation model; >> + const char * const *clks; >> + int num_clks; > > nit: Perhaps s/clks/clocks/ here or s/clocks/clks/ in struct arm_smmu_device? Sure. Will change to s/clocks/clks/ in struct arm_smmu_device. > >> }; >> >> #define ARM_SMMU_MATCH_DATA(name, ver, imp) \ >> -static struct arm_smmu_match_data name = { .version = ver, .model = imp } >> +static const struct arm_smmu_match_data name = { .version = ver, .model = imp } >> >> ARM_SMMU_MATCH_DATA(smmu_generic_v1, ARM_SMMU_V1, GENERIC_SMMU); >> ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU); >> @@ -2001,6 +2006,7 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev, >> data = of_device_get_match_data(dev); >> smmu->version = data->version; >> smmu->model = data->model; >> + smmu->num_clks = data->num_clks; >> >> parse_driver_options(smmu); >> >> @@ -2039,6 +2045,28 @@ static void arm_smmu_bus_init(void) >> #endif >> } >> >> +static int arm_smmu_init_clks(struct arm_smmu_device *smmu) >> +{ >> + int i; >> + int num = smmu->num_clks; >> + const struct arm_smmu_match_data *data; >> + >> + if (num < 1) >> + return 0; >> + >> + smmu->clocks = devm_kcalloc(smmu->dev, num, >> + sizeof(*smmu->clocks), GFP_KERNEL); >> + if (!smmu->clocks) >> + return -ENOMEM; >> + >> + data = of_device_get_match_data(smmu->dev); >> + >> + for (i = 0; i < num; i++) >> + smmu->clocks[i].id = data->clks[i]; > > I'd argue that arm_smmu_device_dt_probe() is a better place for all > the code above, since this function is called regardless of whether > the device is probed from DT or not. Going further, > arm_smmu_device_acpi_probe() could fill smmu->num_clks and ->clocks > using ACPI-like way (as opposed to OF match data) if necessary. Right, it's valid to fill the data in arm_smmu_device_dt_probe(). Perhaps we can just keep the devm_clk_bulk_get() in arm_smmu_device_probe() at the point where we are currently doing arm_smmu_init_clks(). Thanks & regards Vivek > > Best regards, > Tomasz > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops 2018-02-13 10:25 ` Vivek Gautam @ 2018-02-14 3:45 ` Tomasz Figa 0 siblings, 0 replies; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 3:45 UTC (permalink / raw) To: Vivek Gautam Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark On Tue, Feb 13, 2018 at 7:25 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: >>> +static int arm_smmu_init_clks(struct arm_smmu_device *smmu) >>> +{ >>> + int i; >>> + int num = smmu->num_clks; >>> + const struct arm_smmu_match_data *data; >>> + >>> + if (num < 1) >>> + return 0; >>> + >>> + smmu->clocks = devm_kcalloc(smmu->dev, num, >>> + sizeof(*smmu->clocks), GFP_KERNEL); >>> + if (!smmu->clocks) >>> + return -ENOMEM; >>> + >>> + data = of_device_get_match_data(smmu->dev); >>> + >>> + for (i = 0; i < num; i++) >>> + smmu->clocks[i].id = data->clks[i]; >> >> I'd argue that arm_smmu_device_dt_probe() is a better place for all >> the code above, since this function is called regardless of whether >> the device is probed from DT or not. Going further, >> arm_smmu_device_acpi_probe() could fill smmu->num_clks and ->clocks >> using ACPI-like way (as opposed to OF match data) if necessary. > > Right, it's valid to fill the data in arm_smmu_device_dt_probe(). > Perhaps we can just keep the devm_clk_bulk_get() in arm_smmu_device_probe() > at the point where we are currently doing arm_smmu_init_clks(). Sounds good to me. Thanks. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 2018-02-07 10:31 ` [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers Vivek Gautam 2018-02-07 10:31 ` [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops Vivek Gautam @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-13 8:24 ` Tomasz Figa [not found] ` <1517999482-17317-4-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 2018-02-07 10:31 ` [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu Vivek Gautam ` (2 subsequent siblings) 5 siblings, 2 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> The smmu device probe/remove and add/remove master device callbacks gets called when the smmu is not linked to its master, that is without the context of the master device. So calling runtime apis in those places separately. Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> [vivek: Cleanup pm runtime calls] Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> --- drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 9e2f917e16c2..c024f69c1682 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); struct arm_smmu_device *smmu = smmu_domain->smmu; struct arm_smmu_cfg *cfg = &smmu_domain->cfg; - int irq; + int ret, irq; if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) return; + ret = pm_runtime_get_sync(smmu->dev); + if (ret) + return; + /* * Disable the context bank and free the page tables before freeing * it. @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) free_io_pgtable_ops(smmu_domain->pgtbl_ops); __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); + + pm_runtime_put_sync(smmu->dev); } static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev) while (i--) cfg->smendx[i] = INVALID_SMENDX; - ret = arm_smmu_master_alloc_smes(dev); + ret = pm_runtime_get_sync(smmu->dev); if (ret) goto out_cfg_free; + ret = arm_smmu_master_alloc_smes(dev); + if (ret) + goto out_rpm_put; + iommu_device_link(&smmu->iommu, dev); + pm_runtime_put_sync(smmu->dev); + return 0; +out_rpm_put: + pm_runtime_put_sync(smmu->dev); out_cfg_free: kfree(cfg); out_free: @@ -1427,7 +1441,7 @@ static void arm_smmu_remove_device(struct device *dev) struct iommu_fwspec *fwspec = dev->iommu_fwspec; struct arm_smmu_master_cfg *cfg; struct arm_smmu_device *smmu; - + int ret; if (!fwspec || fwspec->ops != &arm_smmu_ops) return; @@ -1435,8 +1449,15 @@ static void arm_smmu_remove_device(struct device *dev) cfg = fwspec->iommu_priv; smmu = cfg->smmu; + ret = pm_runtime_get_sync(smmu->dev); + if (ret) + return; + iommu_device_unlink(&smmu->iommu, dev); arm_smmu_master_free_smes(fwspec); + + pm_runtime_put_sync(smmu->dev); + iommu_group_remove_device(dev); kfree(fwspec->iommu_priv); iommu_fwspec_free(dev); @@ -2131,6 +2152,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev) if (err) return err; + platform_set_drvdata(pdev, smmu); + + pm_runtime_enable(dev); + + err = pm_runtime_get_sync(dev); + if (err) + return err; + err = arm_smmu_device_cfg_probe(smmu); if (err) return err; @@ -2172,9 +2201,9 @@ static int arm_smmu_device_probe(struct platform_device *pdev) return err; } - platform_set_drvdata(pdev, smmu); arm_smmu_device_reset(smmu); arm_smmu_test_smr_masks(smmu); + pm_runtime_put_sync(dev); /* * For ACPI and generic DT bindings, an SMMU will be probed before @@ -2211,8 +2240,13 @@ static int arm_smmu_device_remove(struct platform_device *pdev) if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) dev_err(&pdev->dev, "removing device with active domains!\n"); + pm_runtime_get_sync(smmu->dev); /* Turn the thing off */ writel(sCR0_CLIENTPD, ARM_SMMU_GR0_NS(smmu) + ARM_SMMU_GR0_sCR0); + pm_runtime_put_sync(smmu->dev); + + pm_runtime_disable(smmu->dev); + return 0; } -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device 2018-02-07 10:31 ` [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device Vivek Gautam @ 2018-02-13 8:24 ` Tomasz Figa [not found] ` <CAAFQd5DxYhkK61VDAesby6bT+FtG2nqsbHQRvxkhrsSS0KWtog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 8:28 ` Vivek Gautam [not found] ` <1517999482-17317-4-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 1 sibling, 2 replies; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 8:24 UTC (permalink / raw) To: Vivek Gautam Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark Hi Vivek, Thanks for the patch. Please see my comments inline. On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > From: Sricharan R <sricharan@codeaurora.org> > > The smmu device probe/remove and add/remove master device callbacks > gets called when the smmu is not linked to its master, that is without > the context of the master device. So calling runtime apis in those places > separately. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > [vivek: Cleanup pm runtime calls] > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 38 insertions(+), 4 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 9e2f917e16c2..c024f69c1682 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > struct arm_smmu_device *smmu = smmu_domain->smmu; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - int irq; > + int ret, irq; > > if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) > return; > > + ret = pm_runtime_get_sync(smmu->dev); > + if (ret) > + return; pm_runtime_get_sync() will return 0 if the device was powered off, 1 if it was already/still powered on or runtime PM is not compiled in, or a negative value on error, so shouldn't the test be (ret < 0)? Moreover, I'm actually wondering if it makes any sense to power up the hardware just to program it and power it down again. In a system where the IOMMU is located within a power domain, it would cause the IOMMU block to lose its state anyway. Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops", perhaps it would make more sense to just control the clocks independently of runtime PM? Then, runtime PM could be used for real power management, e.g. really powering the block up and down, for further power saving. +Generally similar comments for other places in this patch. > + > /* > * Disable the context bank and free the page tables before freeing > * it. > @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > > free_io_pgtable_ops(smmu_domain->pgtbl_ops); > __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); > + > + pm_runtime_put_sync(smmu->dev); Is there any point in the put being sync here? [snip] > @@ -2131,6 +2152,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev) > if (err) > return err; > > + platform_set_drvdata(pdev, smmu); > + > + pm_runtime_enable(dev); I suspect this may be a disaster for systems where IOMMUs are located inside power domains, because the driver doesn't take care of the IOMMU block losing its state on physical power down, as I mentioned in my comments above. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5DxYhkK61VDAesby6bT+FtG2nqsbHQRvxkhrsSS0KWtog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <CAAFQd5DxYhkK61VDAesby6bT+FtG2nqsbHQRvxkhrsSS0KWtog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-13 12:57 ` Robin Murphy [not found] ` <906051dd-8898-ec6f-5ad4-3f37716292cf-5wv7dgnIgG8@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Robin Murphy @ 2018-02-13 12:57 UTC (permalink / raw) To: Tomasz Figa, Vivek Gautam Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Rafael J. Wysocki, Will Deacon, dri-devel, linux-kernel-u79uwXL29TY76Z2rM5mHXA, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, Rob Herring, Greg KH, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA On 13/02/18 08:24, Tomasz Figa wrote: > Hi Vivek, > > Thanks for the patch. Please see my comments inline. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> >> The smmu device probe/remove and add/remove master device callbacks >> gets called when the smmu is not linked to its master, that is without >> the context of the master device. So calling runtime apis in those places >> separately. >> >> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> [vivek: Cleanup pm runtime calls] >> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 38 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 9e2f917e16c2..c024f69c1682 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> struct arm_smmu_device *smmu = smmu_domain->smmu; >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> - int irq; >> + int ret, irq; >> >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >> return; >> >> + ret = pm_runtime_get_sync(smmu->dev); >> + if (ret) >> + return; > > pm_runtime_get_sync() will return 0 if the device was powered off, 1 > if it was already/still powered on or runtime PM is not compiled in, > or a negative value on error, so shouldn't the test be (ret < 0)? > > Moreover, I'm actually wondering if it makes any sense to power up the > hardware just to program it and power it down again. In a system where > the IOMMU is located within a power domain, it would cause the IOMMU > block to lose its state anyway. This is generally for the case where the SMMU internal state remains active, but the programming interface needs to be powered up in order to access it. > Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add > pm_runtime/sleep ops", perhaps it would make more sense to just > control the clocks independently of runtime PM? Then, runtime PM could > be used for real power management, e.g. really powering the block up > and down, for further power saving. Unfortunately that ends up pretty much unmanageable, because there are numerous different SMMU microarchitectures with fundamentally different clock/power domain schemes (multiplied by individual SoC integration possibilities). Since this is fundamentally a generic architectural driver, adding explicit clock support would probably make the whole thing about 50% clock code, with complicated decision trees around every hardware access calculating which clocks are necessary for a given operation on a given system. That maintainability aspect is why we've already nacked such a fine-grained approach in the past. Robin. ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <906051dd-8898-ec6f-5ad4-3f37716292cf-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <906051dd-8898-ec6f-5ad4-3f37716292cf-5wv7dgnIgG8@public.gmane.org> @ 2018-02-13 13:52 ` Tomasz Figa [not found] ` <CAAFQd5DJtQYPg5S3Ep2bK27+D5rQiKuA-uPfMDUon3FudmGF0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 13:52 UTC (permalink / raw) To: Robin Murphy Cc: Vivek Gautam, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, Joerg Roedel, Rob Herring, Mark Rutland, Rafael J. Wysocki, Will Deacon, Rob Clark, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, David Airlie, Greg KH, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA On Tue, Feb 13, 2018 at 9:57 PM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote: > On 13/02/18 08:24, Tomasz Figa wrote: >> >> Hi Vivek, >> >> Thanks for the patch. Please see my comments inline. >> >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>> >>> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>> >>> The smmu device probe/remove and add/remove master device callbacks >>> gets called when the smmu is not linked to its master, that is without >>> the context of the master device. So calling runtime apis in those places >>> separately. >>> >>> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>> [vivek: Cleanup pm runtime calls] >>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>> --- >>> drivers/iommu/arm-smmu.c | 42 >>> ++++++++++++++++++++++++++++++++++++++---- >>> 1 file changed, 38 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >>> index 9e2f917e16c2..c024f69c1682 100644 >>> --- a/drivers/iommu/arm-smmu.c >>> +++ b/drivers/iommu/arm-smmu.c >>> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct >>> iommu_domain *domain) >>> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >>> struct arm_smmu_device *smmu = smmu_domain->smmu; >>> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >>> - int irq; >>> + int ret, irq; >>> >>> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >>> return; >>> >>> + ret = pm_runtime_get_sync(smmu->dev); >>> + if (ret) >>> + return; >> >> >> pm_runtime_get_sync() will return 0 if the device was powered off, 1 >> if it was already/still powered on or runtime PM is not compiled in, >> or a negative value on error, so shouldn't the test be (ret < 0)? >> >> Moreover, I'm actually wondering if it makes any sense to power up the >> hardware just to program it and power it down again. In a system where >> the IOMMU is located within a power domain, it would cause the IOMMU >> block to lose its state anyway. > > > This is generally for the case where the SMMU internal state remains active, > but the programming interface needs to be powered up in order to access it. That's true for Qualcomm SMMU, but I think that would be different for existing users of the driver? > >> Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add >> pm_runtime/sleep ops", perhaps it would make more sense to just >> control the clocks independently of runtime PM? Then, runtime PM could >> be used for real power management, e.g. really powering the block up >> and down, for further power saving. > > > Unfortunately that ends up pretty much unmanageable, because there are > numerous different SMMU microarchitectures with fundamentally different > clock/power domain schemes (multiplied by individual SoC integration > possibilities). Since this is fundamentally a generic architectural driver, > adding explicit clock support would probably make the whole thing about 50% > clock code, with complicated decision trees around every hardware access > calculating which clocks are necessary for a given operation on a given > system. That maintainability aspect is why we've already nacked such a > fine-grained approach in the past. Hmm, I think we are talking about different things here. My suggestion would not add much more code to the driver than this patch does, calls to arm_smmu_enable_clocks() instead of pm_runtime_get_sync() and arm_smmu_disable_clocks() instead of pm_runtime_put(). The implementation of both functions would be a simple call to clk_bulk_ API (possibly even no need to put this into functions, just call directly). Best regards, Tomasz -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5DJtQYPg5S3Ep2bK27+D5rQiKuA-uPfMDUon3FudmGF0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <CAAFQd5DJtQYPg5S3Ep2bK27+D5rQiKuA-uPfMDUon3FudmGF0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 8:24 ` Vivek Gautam 0 siblings, 0 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-14 8:24 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, Rafael J. Wysocki, dri-devel, open list, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Tue, Feb 13, 2018 at 7:22 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > On Tue, Feb 13, 2018 at 9:57 PM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote: >> On 13/02/18 08:24, Tomasz Figa wrote: >>> >>> Hi Vivek, >>> >>> Thanks for the patch. Please see my comments inline. >>> >>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>> >>>> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>> >>>> The smmu device probe/remove and add/remove master device callbacks >>>> gets called when the smmu is not linked to its master, that is without >>>> the context of the master device. So calling runtime apis in those places >>>> separately. >>>> >>>> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>> [vivek: Cleanup pm runtime calls] >>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>> --- >>>> drivers/iommu/arm-smmu.c | 42 >>>> ++++++++++++++++++++++++++++++++++++++---- >>>> 1 file changed, 38 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >>>> index 9e2f917e16c2..c024f69c1682 100644 >>>> --- a/drivers/iommu/arm-smmu.c >>>> +++ b/drivers/iommu/arm-smmu.c >>>> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct >>>> iommu_domain *domain) >>>> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >>>> struct arm_smmu_device *smmu = smmu_domain->smmu; >>>> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >>>> - int irq; >>>> + int ret, irq; >>>> >>>> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >>>> return; >>>> >>>> + ret = pm_runtime_get_sync(smmu->dev); >>>> + if (ret) >>>> + return; >>> >>> >>> pm_runtime_get_sync() will return 0 if the device was powered off, 1 >>> if it was already/still powered on or runtime PM is not compiled in, >>> or a negative value on error, so shouldn't the test be (ret < 0)? >>> >>> Moreover, I'm actually wondering if it makes any sense to power up the >>> hardware just to program it and power it down again. In a system where >>> the IOMMU is located within a power domain, it would cause the IOMMU >>> block to lose its state anyway. >> >> >> This is generally for the case where the SMMU internal state remains active, >> but the programming interface needs to be powered up in order to access it. > > That's true for Qualcomm SMMU, but I think that would be different for > existing users of the driver? > >> >>> Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add >>> pm_runtime/sleep ops", perhaps it would make more sense to just >>> control the clocks independently of runtime PM? Then, runtime PM could >>> be used for real power management, e.g. really powering the block up >>> and down, for further power saving. >> >> >> Unfortunately that ends up pretty much unmanageable, because there are >> numerous different SMMU microarchitectures with fundamentally different >> clock/power domain schemes (multiplied by individual SoC integration >> possibilities). Since this is fundamentally a generic architectural driver, >> adding explicit clock support would probably make the whole thing about 50% >> clock code, with complicated decision trees around every hardware access >> calculating which clocks are necessary for a given operation on a given >> system. That maintainability aspect is why we've already nacked such a >> fine-grained approach in the past. > > Hmm, I think we are talking about different things here. My suggestion > would not add much more code to the driver than this patch does, calls > to arm_smmu_enable_clocks() instead of pm_runtime_get_sync() and > arm_smmu_disable_clocks() instead of pm_runtime_put(). The > implementation of both functions would be a simple call to clk_bulk_ > API (possibly even no need to put this into functions, just call > directly). Well, things are not so straight on msm. The IP clocks on msm are usually powered by (or i should rather say, controlled by) the same power domain that provides the VDD supply to iommu block. This is the behavior on msm8996 atleast that we are testing on right now. On later SoCs too things don't change drastically. So, you can't have the block in low power state until you program the register space and then power on the block to let it do its magic. Clocks and power domains are linked, and that's why we add them to the pm callbacks. This approach also looks generic to me since the platforms will either have such a link or they will not have. But, in either case you will have power and clocks available at the time when you need them. Thanks & regards Vivek > > Best regards, > Tomasz > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device 2018-02-13 8:24 ` Tomasz Figa [not found] ` <CAAFQd5DxYhkK61VDAesby6bT+FtG2nqsbHQRvxkhrsSS0KWtog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 8:28 ` Vivek Gautam 1 sibling, 0 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-14 8:28 UTC (permalink / raw) To: Tomasz Figa Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark Hi Tomasz, On Tue, Feb 13, 2018 at 1:54 PM, Tomasz Figa <tfiga@chromium.org> wrote: > Hi Vivek, > > Thanks for the patch. Please see my comments inline. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam@codeaurora.org> wrote: >> From: Sricharan R <sricharan@codeaurora.org> >> >> The smmu device probe/remove and add/remove master device callbacks >> gets called when the smmu is not linked to its master, that is without >> the context of the master device. So calling runtime apis in those places >> separately. >> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >> [vivek: Cleanup pm runtime calls] >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 38 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 9e2f917e16c2..c024f69c1682 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> struct arm_smmu_device *smmu = smmu_domain->smmu; >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> - int irq; >> + int ret, irq; >> >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >> return; >> >> + ret = pm_runtime_get_sync(smmu->dev); >> + if (ret) >> + return; > > pm_runtime_get_sync() will return 0 if the device was powered off, 1 > if it was already/still powered on or runtime PM is not compiled in, > or a negative value on error, so shouldn't the test be (ret < 0)? Yes, I too noticed it while i was testing on a different platform, and was hitting a failure case. Will update at all places. > > Moreover, I'm actually wondering if it makes any sense to power up the > hardware just to program it and power it down again. In a system where > the IOMMU is located within a power domain, it would cause the IOMMU > block to lose its state anyway. > > Actually, reflecting back on "[PATCH v7 2/6] iommu/arm-smmu: Add > pm_runtime/sleep ops", perhaps it would make more sense to just > control the clocks independently of runtime PM? Then, runtime PM could > be used for real power management, e.g. really powering the block up > and down, for further power saving. > > +Generally similar comments for other places in this patch. > >> + >> /* >> * Disable the context bank and free the page tables before freeing >> * it. >> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> >> free_io_pgtable_ops(smmu_domain->pgtbl_ops); >> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); >> + >> + pm_runtime_put_sync(smmu->dev); > > Is there any point in the put being sync here? No, I don't think. Can manage with just a 'put' here. Will modify. best regards Vivek > > [snip] > >> @@ -2131,6 +2152,14 @@ static int arm_smmu_device_probe(struct platform_device *pdev) >> if (err) >> return err; >> >> + platform_set_drvdata(pdev, smmu); >> + >> + pm_runtime_enable(dev); > > I suspect this may be a disaster for systems where IOMMUs are located > inside power domains, because the driver doesn't take care of the > IOMMU block losing its state on physical power down, as I mentioned in > my comments above. > > Best regards, > Tomasz -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <1517999482-17317-4-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <1517999482-17317-4-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> @ 2018-02-22 23:52 ` Jordan Crouse [not found] ` <20180222235200.GA18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Jordan Crouse @ 2018-02-22 23:52 UTC (permalink / raw) To: Vivek Gautam Cc: mark.rutland-5wv7dgnIgG8, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, airlied-cv59FeDIM0c, will.deacon-5wv7dgnIgG8, joro-zLv9SwRftAIdnm+yROfE0A, rjw-LthD3rsA81gm4RdzfppkhA, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, linux-kernel-u79uwXL29TY76Z2rM5mHXA, robdclark-Re5JQEeQqe8AvxtiuMwx3w, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, robin.murphy-5wv7dgnIgG8, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA On Wed, Feb 07, 2018 at 04:01:19PM +0530, Vivek Gautam wrote: > From: Sricharan R <sricharan@codeaurora.org> > > The smmu device probe/remove and add/remove master device callbacks > gets called when the smmu is not linked to its master, that is without > the context of the master device. So calling runtime apis in those places > separately. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > [vivek: Cleanup pm runtime calls] > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 38 insertions(+), 4 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 9e2f917e16c2..c024f69c1682 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > struct arm_smmu_device *smmu = smmu_domain->smmu; > struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > - int irq; > + int ret, irq; > > if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) > return; > > + ret = pm_runtime_get_sync(smmu->dev); > + if (ret) > + return; > + > /* > * Disable the context bank and free the page tables before freeing > * it. > @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > > free_io_pgtable_ops(smmu_domain->pgtbl_ops); > __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); > + > + pm_runtime_put_sync(smmu->dev); > } > > static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) > @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev) > while (i--) > cfg->smendx[i] = INVALID_SMENDX; > > - ret = arm_smmu_master_alloc_smes(dev); > + ret = pm_runtime_get_sync(smmu->dev); > if (ret) > goto out_cfg_free; Hey Vivek, I just hit a problem with this on sdm845. It turns out that pm_runtime_get_sync() returns a positive 1 if the device is already active. I hit this in the GPU code. The a6xx has two platform devices that each use a different sid on the iommu. The GPU is probed normally from a platform driver and it in turn initializes the GMU device by way of a phandle. Because the GMU isn't probed with a platform driver we need to call of_dma_configure() on the device to set up the IOMMU for the device which ends up calling through this path and we discover that the smmu->dev is already powered (pm_runtime_get_sync returns 1). I'm not immediately sure if this is a bug on sdm845 or not because a cursory inspection says that the SMMU device shouldn't be powered at this time but there might be a connection that I'm not seeing. Obviously if the SMMU was left powered thats a bad thing. But putting that aside it is obvious that this code should be accommodating of the possibility that the device is already powered, and so this should be if (ret < 0) goto out_cfg_free; With that the GPU/GMU successfully comes up on Sean Paul's display testing branch. Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <20180222235200.GA18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <20180222235200.GA18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> @ 2018-02-23 10:36 ` Vivek Gautam [not found] ` <CAFp+6iGQ5Vckui14Jb=V0uk_Pjes95hOxo=KBijR4yxPeDDzFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-23 10:36 UTC (permalink / raw) To: Vivek Gautam, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , robh+dt, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark On Fri, Feb 23, 2018 at 5:22 AM, Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > On Wed, Feb 07, 2018 at 04:01:19PM +0530, Vivek Gautam wrote: >> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> >> The smmu device probe/remove and add/remove master device callbacks >> gets called when the smmu is not linked to its master, that is without >> the context of the master device. So calling runtime apis in those places >> separately. >> >> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> [vivek: Cleanup pm runtime calls] >> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >> --- >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- >> 1 file changed, 38 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index 9e2f917e16c2..c024f69c1682 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> struct arm_smmu_device *smmu = smmu_domain->smmu; >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> - int irq; >> + int ret, irq; >> >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >> return; >> >> + ret = pm_runtime_get_sync(smmu->dev); >> + if (ret) >> + return; >> + >> /* >> * Disable the context bank and free the page tables before freeing >> * it. >> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> >> free_io_pgtable_ops(smmu_domain->pgtbl_ops); >> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); >> + >> + pm_runtime_put_sync(smmu->dev); >> } >> >> static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) >> @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev) >> while (i--) >> cfg->smendx[i] = INVALID_SMENDX; >> >> - ret = arm_smmu_master_alloc_smes(dev); >> + ret = pm_runtime_get_sync(smmu->dev); >> if (ret) >> goto out_cfg_free; > > Hey Vivek, I just hit a problem with this on sdm845. It turns out that > pm_runtime_get_sync() returns a positive 1 if the device is already active. > > I hit this in the GPU code. The a6xx has two platform devices that each use a > different sid on the iommu. The GPU is probed normally from a platform driver > and it in turn initializes the GMU device by way of a phandle. > > Because the GMU isn't probed with a platform driver we need to call > of_dma_configure() on the device to set up the IOMMU for the device which ends > up calling through this path and we discover that the smmu->dev is already > powered (pm_runtime_get_sync returns 1). > > I'm not immediately sure if this is a bug on sdm845 or not because a cursory > inspection says that the SMMU device shouldn't be powered at this time but there > might be a connection that I'm not seeing. Obviously if the SMMU was left > powered thats a bad thing. But putting that aside it is obvious that this > code should be accommodating of the possibility that the device is already > powered, and so this should be > > if (ret < 0) > goto out_cfg_free; Right, as Tomasz also pointed, we should surely check the negative value of pm_runtime_get_sync(). >From your description, it may be that the GPU has turned on the smmu, and then once if goes and probes the GMU, the GMU device also wants to turn-on the same smmu device. But that's already active. So pm_runtime_get_sync() returns 1. Am i making sense? regards Vivek > > With that the GPU/GMU successfully comes up on Sean Paul's display testing > branch. > > Jordan > > -- > The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, > a Linux Foundation Collaborative Project > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAFp+6iGQ5Vckui14Jb=V0uk_Pjes95hOxo=KBijR4yxPeDDzFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [Freedreno] [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <CAFp+6iGQ5Vckui14Jb=V0uk_Pjes95hOxo=KBijR4yxPeDDzFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-23 15:40 ` Jordan Crouse [not found] ` <20180223154048.GB18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Jordan Crouse @ 2018-02-23 15:40 UTC (permalink / raw) To: Vivek Gautam Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , dri-devel, open list, robh+dt, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Fri, Feb 23, 2018 at 04:06:39PM +0530, Vivek Gautam wrote: > On Fri, Feb 23, 2018 at 5:22 AM, Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > > On Wed, Feb 07, 2018 at 04:01:19PM +0530, Vivek Gautam wrote: > >> From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> > >> > >> The smmu device probe/remove and add/remove master device callbacks > >> gets called when the smmu is not linked to its master, that is without > >> the context of the master device. So calling runtime apis in those places > >> separately. > >> > >> Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> > >> [vivek: Cleanup pm runtime calls] > >> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> > >> --- > >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- > >> 1 file changed, 38 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > >> index 9e2f917e16c2..c024f69c1682 100644 > >> --- a/drivers/iommu/arm-smmu.c > >> +++ b/drivers/iommu/arm-smmu.c > >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > >> struct arm_smmu_device *smmu = smmu_domain->smmu; > >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; > >> - int irq; > >> + int ret, irq; > >> > >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) > >> return; > >> > >> + ret = pm_runtime_get_sync(smmu->dev); > >> + if (ret) > >> + return; > >> + > >> /* > >> * Disable the context bank and free the page tables before freeing > >> * it. > >> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) > >> > >> free_io_pgtable_ops(smmu_domain->pgtbl_ops); > >> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); > >> + > >> + pm_runtime_put_sync(smmu->dev); > >> } > >> > >> static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) > >> @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev) > >> while (i--) > >> cfg->smendx[i] = INVALID_SMENDX; > >> > >> - ret = arm_smmu_master_alloc_smes(dev); > >> + ret = pm_runtime_get_sync(smmu->dev); > >> if (ret) > >> goto out_cfg_free; > > > > Hey Vivek, I just hit a problem with this on sdm845. It turns out that > > pm_runtime_get_sync() returns a positive 1 if the device is already active. > > > > I hit this in the GPU code. The a6xx has two platform devices that each use a > > different sid on the iommu. The GPU is probed normally from a platform driver > > and it in turn initializes the GMU device by way of a phandle. > > > > Because the GMU isn't probed with a platform driver we need to call > > of_dma_configure() on the device to set up the IOMMU for the device which ends > > up calling through this path and we discover that the smmu->dev is already > > powered (pm_runtime_get_sync returns 1). > > > > I'm not immediately sure if this is a bug on sdm845 or not because a cursory > > inspection says that the SMMU device shouldn't be powered at this time but there > > might be a connection that I'm not seeing. Obviously if the SMMU was left > > powered thats a bad thing. But putting that aside it is obvious that this > > code should be accommodating of the possibility that the device is already > > powered, and so this should be > > > > if (ret < 0) > > goto out_cfg_free; > > Right, as Tomasz also pointed, we should surely check the negative value of > pm_runtime_get_sync(). Sorry, I didn't notice that Tomasz had pointed it out as well. I wanted to quickly get it on the mailing list so you could catch it in your time zone. > From your description, it may be that the GPU has turned on the smmu, and > then once if goes and probes the GMU, the GMU device also wants to turn-on > the same smmu device. But that's already active. So pm_runtime_get_sync() > returns 1. > Am i making sense? My concern is that this is happening during the probe and we shouldn't be energizing the GPU at this point. But it is entirely possible that the bus is on for other reasons. I'll do a bit of digging today and see exactly which device is at fault. Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <20180223154048.GB18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>]
* Re: [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device [not found] ` <20180223154048.GB18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> @ 2018-02-23 17:43 ` Vivek Gautam 0 siblings, 0 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-23 17:43 UTC (permalink / raw) To: Vivek Gautam, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , robh+dt, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark On Fri, Feb 23, 2018 at 9:10 PM, Jordan Crouse <jcrouse@codeaurora.org> wrote: > On Fri, Feb 23, 2018 at 04:06:39PM +0530, Vivek Gautam wrote: >> On Fri, Feb 23, 2018 at 5:22 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote: >> > On Wed, Feb 07, 2018 at 04:01:19PM +0530, Vivek Gautam wrote: >> >> From: Sricharan R <sricharan@codeaurora.org> >> >> >> >> The smmu device probe/remove and add/remove master device callbacks >> >> gets called when the smmu is not linked to its master, that is without >> >> the context of the master device. So calling runtime apis in those places >> >> separately. >> >> >> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >> >> [vivek: Cleanup pm runtime calls] >> >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> >> --- >> >> drivers/iommu/arm-smmu.c | 42 ++++++++++++++++++++++++++++++++++++++---- >> >> 1 file changed, 38 insertions(+), 4 deletions(-) >> >> >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> >> index 9e2f917e16c2..c024f69c1682 100644 >> >> --- a/drivers/iommu/arm-smmu.c >> >> +++ b/drivers/iommu/arm-smmu.c >> >> @@ -913,11 +913,15 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> >> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> >> struct arm_smmu_device *smmu = smmu_domain->smmu; >> >> struct arm_smmu_cfg *cfg = &smmu_domain->cfg; >> >> - int irq; >> >> + int ret, irq; >> >> >> >> if (!smmu || domain->type == IOMMU_DOMAIN_IDENTITY) >> >> return; >> >> >> >> + ret = pm_runtime_get_sync(smmu->dev); >> >> + if (ret) >> >> + return; >> >> + >> >> /* >> >> * Disable the context bank and free the page tables before freeing >> >> * it. >> >> @@ -932,6 +936,8 @@ static void arm_smmu_destroy_domain_context(struct iommu_domain *domain) >> >> >> >> free_io_pgtable_ops(smmu_domain->pgtbl_ops); >> >> __arm_smmu_free_bitmap(smmu->context_map, cfg->cbndx); >> >> + >> >> + pm_runtime_put_sync(smmu->dev); >> >> } >> >> >> >> static struct iommu_domain *arm_smmu_domain_alloc(unsigned type) >> >> @@ -1407,14 +1413,22 @@ static int arm_smmu_add_device(struct device *dev) >> >> while (i--) >> >> cfg->smendx[i] = INVALID_SMENDX; >> >> >> >> - ret = arm_smmu_master_alloc_smes(dev); >> >> + ret = pm_runtime_get_sync(smmu->dev); >> >> if (ret) >> >> goto out_cfg_free; >> > >> > Hey Vivek, I just hit a problem with this on sdm845. It turns out that >> > pm_runtime_get_sync() returns a positive 1 if the device is already active. >> > >> > I hit this in the GPU code. The a6xx has two platform devices that each use a >> > different sid on the iommu. The GPU is probed normally from a platform driver >> > and it in turn initializes the GMU device by way of a phandle. >> > >> > Because the GMU isn't probed with a platform driver we need to call >> > of_dma_configure() on the device to set up the IOMMU for the device which ends >> > up calling through this path and we discover that the smmu->dev is already >> > powered (pm_runtime_get_sync returns 1). >> > >> > I'm not immediately sure if this is a bug on sdm845 or not because a cursory >> > inspection says that the SMMU device shouldn't be powered at this time but there >> > might be a connection that I'm not seeing. Obviously if the SMMU was left >> > powered thats a bad thing. But putting that aside it is obvious that this >> > code should be accommodating of the possibility that the device is already >> > powered, and so this should be >> > >> > if (ret < 0) >> > goto out_cfg_free; >> >> Right, as Tomasz also pointed, we should surely check the negative value of >> pm_runtime_get_sync(). > > Sorry, I didn't notice that Tomasz had pointed it out as well. I wanted to > quickly get it on the mailing list so you could catch it in your time zone. > >> From your description, it may be that the GPU has turned on the smmu, and >> then once if goes and probes the GMU, the GMU device also wants to turn-on >> the same smmu device. But that's already active. So pm_runtime_get_sync() >> returns 1. >> Am i making sense? > > My concern is that this is happening during the probe and we shouldn't be > energizing the GPU at this point. But it is entirely possible that the > bus is on for other reasons. I'll do a bit of digging today and see exactly > which device is at fault. I will try to check it myself too. regards Vivek > > > Jordan > -- > The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, > a Linux Foundation Collaborative Project > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> ` (2 preceding siblings ...) 2018-02-07 10:31 ` [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device Vivek Gautam @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-13 8:31 ` Tomasz Figa 2018-02-07 10:31 ` [PATCH v7 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant Vivek Gautam 2018-02-07 10:31 ` [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers Vivek Gautam 5 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA From: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> Finally add the device link between the master device and smmu, so that the smmu gets runtime enabled/disabled only when the master needs it. This is done from add_device callback which gets called once when the master is added to the smmu. Signed-off-by: Sricharan R <sricharan-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> --- drivers/iommu/arm-smmu.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index c024f69c1682..c7e924d553bd 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -215,6 +215,9 @@ struct arm_smmu_device { /* IOMMU core code handle */ struct iommu_device iommu; + + /* runtime PM link to master */ + struct device_link *link; }; enum arm_smmu_context_fmt { @@ -1425,6 +1428,17 @@ static int arm_smmu_add_device(struct device *dev) pm_runtime_put_sync(smmu->dev); + /* + * Establish the link between smmu and master, so that the + * smmu gets runtime enabled/disabled as per the master's + * needs. + */ + smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME); + if (!smmu->link) + dev_warn(smmu->dev, + "Unable to create device link between %s and %s\n", + dev_name(smmu->dev), dev_name(dev)); + return 0; out_rpm_put: @@ -1449,6 +1463,8 @@ static void arm_smmu_remove_device(struct device *dev) cfg = fwspec->iommu_priv; smmu = cfg->smmu; + device_link_del(smmu->link); + ret = pm_runtime_get_sync(smmu->dev); if (ret) return; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu 2018-02-07 10:31 ` [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu Vivek Gautam @ 2018-02-13 8:31 ` Tomasz Figa [not found] ` <CAAFQd5BQAs9=N27_Z0pJNSrndFY3vFin2KBP44UtiW+tMXy5nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 8:31 UTC (permalink / raw) To: Vivek Gautam Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark Hi Vivek, Thanks for the patch. Please see my comments inline. On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > From: Sricharan R <sricharan@codeaurora.org> > > Finally add the device link between the master device and > smmu, so that the smmu gets runtime enabled/disabled only when the > master needs it. This is done from add_device callback which gets > called once when the master is added to the smmu. > > Signed-off-by: Sricharan R <sricharan@codeaurora.org> > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index c024f69c1682..c7e924d553bd 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -215,6 +215,9 @@ struct arm_smmu_device { > > /* IOMMU core code handle */ > struct iommu_device iommu; > + > + /* runtime PM link to master */ > + struct device_link *link; > }; > > enum arm_smmu_context_fmt { > @@ -1425,6 +1428,17 @@ static int arm_smmu_add_device(struct device *dev) > > pm_runtime_put_sync(smmu->dev); > > + /* > + * Establish the link between smmu and master, so that the > + * smmu gets runtime enabled/disabled as per the master's > + * needs. > + */ > + smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME); > + if (!smmu->link) > + dev_warn(smmu->dev, > + "Unable to create device link between %s and %s\n", > + dev_name(smmu->dev), dev_name(dev)); How likely it is that the master can work normally even if the link add fails? Perhaps we should just return an error here? > + > return 0; > > out_rpm_put: > @@ -1449,6 +1463,8 @@ static void arm_smmu_remove_device(struct device *dev) > cfg = fwspec->iommu_priv; > smmu = cfg->smmu; > > + device_link_del(smmu->link); We allowed smmu->link in arm_smmu_add_device(), but here we don't check it. Looking at the code, device_link_del() doesn't seem to check either. Note that this problem would go away if we fail add_device on device_link_add() failure, as I suggested above, so no change would be necessary. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5BQAs9=N27_Z0pJNSrndFY3vFin2KBP44UtiW+tMXy5nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu [not found] ` <CAAFQd5BQAs9=N27_Z0pJNSrndFY3vFin2KBP44UtiW+tMXy5nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-13 10:14 ` Vivek Gautam 0 siblings, 0 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-13 10:14 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rafael J. Wysocki Hi Tomasz, On Tue, Feb 13, 2018 at 2:01 PM, Tomasz Figa <tfiga@chromium.org> wrote: > Hi Vivek, > > Thanks for the patch. Please see my comments inline. Thanks for reviewing the patch series. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam@codeaurora.org> wrote: >> From: Sricharan R <sricharan@codeaurora.org> >> >> Finally add the device link between the master device and >> smmu, so that the smmu gets runtime enabled/disabled only when the >> master needs it. This is done from add_device callback which gets >> called once when the master is added to the smmu. >> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org> >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> drivers/iommu/arm-smmu.c | 16 ++++++++++++++++ >> 1 file changed, 16 insertions(+) >> >> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> index c024f69c1682..c7e924d553bd 100644 >> --- a/drivers/iommu/arm-smmu.c >> +++ b/drivers/iommu/arm-smmu.c >> @@ -215,6 +215,9 @@ struct arm_smmu_device { >> >> /* IOMMU core code handle */ >> struct iommu_device iommu; >> + >> + /* runtime PM link to master */ >> + struct device_link *link; >> }; >> >> enum arm_smmu_context_fmt { >> @@ -1425,6 +1428,17 @@ static int arm_smmu_add_device(struct device *dev) >> >> pm_runtime_put_sync(smmu->dev); >> >> + /* >> + * Establish the link between smmu and master, so that the >> + * smmu gets runtime enabled/disabled as per the master's >> + * needs. >> + */ >> + smmu->link = device_link_add(dev, smmu->dev, DL_FLAG_PM_RUNTIME); >> + if (!smmu->link) >> + dev_warn(smmu->dev, >> + "Unable to create device link between %s and %s\n", >> + dev_name(smmu->dev), dev_name(dev)); > > How likely it is that the master can work normally even if the link > add fails? Perhaps we should just return an error here? Right. We are assuming that the power is handled for most of the smmu operations, after we add the master with smmu, based on the fact that the device link is successful. We should return error code here. Will make the necessary change. > >> + >> return 0; >> >> out_rpm_put: >> @@ -1449,6 +1463,8 @@ static void arm_smmu_remove_device(struct device *dev) >> cfg = fwspec->iommu_priv; >> smmu = cfg->smmu; >> >> + device_link_del(smmu->link); > > We allowed smmu->link in arm_smmu_add_device(), but here we don't > check it. Looking at the code, device_link_del() doesn't seem to check > either. > > Note that this problem would go away if we fail add_device on > device_link_add() failure, as I suggested above, so no change would be > necessary. Sure. After making the above change, this should also be handled. Best regards Vivek > > Best regards, > Tomasz -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v7 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> ` (3 preceding siblings ...) 2018-02-07 10:31 ` [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu Vivek Gautam @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-09 10:57 ` [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant Vivek Gautam 2018-02-07 10:31 ` [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers Vivek Gautam 5 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: architt-sgV2jX0FEOL9JmXXK+q4OQ, airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, alex.williamson-H+wXaHxf7aLQT0dZR+AlfA, vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, sricharan-sgV2jX0FEOL9JmXXK+q4OQ, m.szyprowski-Sze3O3UU22JBDgjK7y7TUQ qcom,smmu-v2 is an arm,smmu-v2 implementation with specific clock and power requirements. This smmu core is used with multiple masters on msm8996, viz. mdss, video, etc. Add bindings for the same. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> --- .../devicetree/bindings/iommu/arm,smmu.txt | 43 ++++++++++++++++++++++ drivers/iommu/arm-smmu.c | 13 +++++++ 2 files changed, 56 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt index 8a6ffce12af5..169222ae2706 100644 --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt @@ -17,10 +17,19 @@ conditions. "arm,mmu-401" "arm,mmu-500" "cavium,smmu-v2" + "qcom,<soc>-smmu-v2", "qcom,smmu-v2" depending on the particular implementation and/or the version of the architecture implemented. + A number of Qcom SoCs use qcom,smmu-v2 version of the IP. + "qcom,<soc>-smmu-v2" represents a soc specific compatible + string that should be present along with the "qcom,smmu-v2" + to facilitate SoC specific clocks/power connections and to + address specific bug fixes. + An example string would be - + "qcom,msm8996-smmu-v2", "qcom,smmu-v2". + - reg : Base address and size of the SMMU. - #global-interrupts : The number of global interrupts exposed by the @@ -71,6 +80,23 @@ conditions. or using stream matching with #iommu-cells = <2>, and may be ignored if present in such cases. +- clock-names: Should be "bus", and "iface" for "qcom,smmu-v2" + implementation. + + "bus" clock for "qcom,smmu-v2" is required for downstream + bus access and for the smmu ptw. + + "iface" clock is required to access smmu's registers through + the TCU's programming interface. + +- clocks: Phandles for respective clocks described by clock-names. + +- power-domains: Phandles to SMMU's power domain specifier. This is + required even if SMMU belongs to the master's power + domain, as the SMMU will have to be enabled and + accessed before master gets enabled and linked to its + SMMU. + ** Deprecated properties: - mmu-masters (deprecated in favour of the generic "iommus" binding) : @@ -137,3 +163,20 @@ conditions. iommu-map = <0 &smmu3 0 0x400>; ... }; + + /* Qcom's arm,smmu-v2 implementation */ + smmu4: iommu { + compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2"; + reg = <0xd00000 0x10000>; + + #global-interrupts = <1>; + interrupts = <GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 320 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 321 IRQ_TYPE_LEVEL_HIGH>; + #iommu-cells = <1>; + power-domains = <&mmcc MDSS_GDSC>; + + clocks = <&mmcc SMMU_MDP_AXI_CLK>, + <&mmcc SMMU_MDP_AHB_CLK>; + clock-names = "bus", "iface"; + }; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index c7e924d553bd..721cf1291f85 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -119,6 +119,7 @@ enum arm_smmu_implementation { GENERIC_SMMU, ARM_MMU500, CAVIUM_SMMUV2, + QCOM_SMMUV2, }; struct arm_smmu_s2cr { @@ -1950,6 +1951,17 @@ struct arm_smmu_match_data { ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); +static const char * const qcom_smmuv2_clks[] = { + "bus", "iface", +}; + +static const struct arm_smmu_match_data qcom_smmuv2 = { + .version = ARM_SMMU_V2, + .model = QCOM_SMMUV2, + .clks = qcom_smmuv2_clks, + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), +}; + static const struct of_device_id arm_smmu_of_match[] = { { .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 }, { .compatible = "arm,smmu-v2", .data = &smmu_generic_v2 }, @@ -1957,6 +1969,7 @@ struct arm_smmu_match_data { { .compatible = "arm,mmu-401", .data = &arm_mmu401 }, { .compatible = "arm,mmu-500", .data = &arm_mmu500 }, { .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 }, + { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 }, { }, }; MODULE_DEVICE_TABLE(of, arm_smmu_of_match); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply related [flat|nested] 53+ messages in thread
* [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant 2018-02-07 10:31 ` [PATCH v7 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant Vivek Gautam @ 2018-02-09 10:57 ` Vivek Gautam 2018-02-13 8:57 ` Tomasz Figa 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-09 10:57 UTC (permalink / raw) To: joro, robh+dt, mark.rutland, rjw, robin.murphy, will.deacon, robdclark, iommu, devicetree, linux-kernel, linux-pm, dri-devel, freedreno Cc: alex.williamson, gregkh, sboyd, sricharan, m.szyprowski, architt, airlied, linux-arm-msm, vivek.gautam qcom,smmu-v2 is an arm,smmu-v2 implementation with specific clock and power requirements. This smmu core is used with multiple masters on msm8996, viz. mdss, video, etc. Add bindings for the same. Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> Reviewed-by: Rob Herring <robh@kernel.org> --- Changes in v8: - Added the missing IOMMU_OF_DECLARE declaration for "qcom,smmu-v2" .../devicetree/bindings/iommu/arm,smmu.txt | 43 ++++++++++++++++++++++ drivers/iommu/arm-smmu.c | 14 +++++++ 2 files changed, 57 insertions(+) diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt index 8a6ffce12af5..169222ae2706 100644 --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt @@ -17,10 +17,19 @@ conditions. "arm,mmu-401" "arm,mmu-500" "cavium,smmu-v2" + "qcom,<soc>-smmu-v2", "qcom,smmu-v2" depending on the particular implementation and/or the version of the architecture implemented. + A number of Qcom SoCs use qcom,smmu-v2 version of the IP. + "qcom,<soc>-smmu-v2" represents a soc specific compatible + string that should be present along with the "qcom,smmu-v2" + to facilitate SoC specific clocks/power connections and to + address specific bug fixes. + An example string would be - + "qcom,msm8996-smmu-v2", "qcom,smmu-v2". + - reg : Base address and size of the SMMU. - #global-interrupts : The number of global interrupts exposed by the @@ -71,6 +80,23 @@ conditions. or using stream matching with #iommu-cells = <2>, and may be ignored if present in such cases. +- clock-names: Should be "bus", and "iface" for "qcom,smmu-v2" + implementation. + + "bus" clock for "qcom,smmu-v2" is required for downstream + bus access and for the smmu ptw. + + "iface" clock is required to access smmu's registers through + the TCU's programming interface. + +- clocks: Phandles for respective clocks described by clock-names. + +- power-domains: Phandles to SMMU's power domain specifier. This is + required even if SMMU belongs to the master's power + domain, as the SMMU will have to be enabled and + accessed before master gets enabled and linked to its + SMMU. + ** Deprecated properties: - mmu-masters (deprecated in favour of the generic "iommus" binding) : @@ -137,3 +163,20 @@ conditions. iommu-map = <0 &smmu3 0 0x400>; ... }; + + /* Qcom's arm,smmu-v2 implementation */ + smmu4: iommu { + compatible = "qcom,msm8996-smmu-v2", "qcom,smmu-v2"; + reg = <0xd00000 0x10000>; + + #global-interrupts = <1>; + interrupts = <GIC_SPI 73 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 320 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 321 IRQ_TYPE_LEVEL_HIGH>; + #iommu-cells = <1>; + power-domains = <&mmcc MDSS_GDSC>; + + clocks = <&mmcc SMMU_MDP_AXI_CLK>, + <&mmcc SMMU_MDP_AHB_CLK>; + clock-names = "bus", "iface"; + }; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index c7e924d553bd..40da3f251acf 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -119,6 +119,7 @@ enum arm_smmu_implementation { GENERIC_SMMU, ARM_MMU500, CAVIUM_SMMUV2, + QCOM_SMMUV2, }; struct arm_smmu_s2cr { @@ -1950,6 +1951,17 @@ struct arm_smmu_match_data { ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500); ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2); +static const char * const qcom_smmuv2_clks[] = { + "bus", "iface", +}; + +static const struct arm_smmu_match_data qcom_smmuv2 = { + .version = ARM_SMMU_V2, + .model = QCOM_SMMUV2, + .clks = qcom_smmuv2_clks, + .num_clks = ARRAY_SIZE(qcom_smmuv2_clks), +}; + static const struct of_device_id arm_smmu_of_match[] = { { .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 }, { .compatible = "arm,smmu-v2", .data = &smmu_generic_v2 }, @@ -1957,6 +1969,7 @@ struct arm_smmu_match_data { { .compatible = "arm,mmu-401", .data = &arm_mmu401 }, { .compatible = "arm,mmu-500", .data = &arm_mmu500 }, { .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 }, + { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 }, { }, }; MODULE_DEVICE_TABLE(of, arm_smmu_of_match); @@ -2319,6 +2332,7 @@ static int __maybe_unused arm_smmu_runtime_suspend(struct device *dev) IOMMU_OF_DECLARE(arm_mmu401, "arm,mmu-401"); IOMMU_OF_DECLARE(arm_mmu500, "arm,mmu-500"); IOMMU_OF_DECLARE(cavium_smmuv2, "cavium,smmu-v2"); +IOMMU_OF_DECLARE(qcom_smmuv2, "qcom,smmu-v2"); MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations"); MODULE_AUTHOR("Will Deacon <will.deacon@arm.com>"); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant 2018-02-09 10:57 ` [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant Vivek Gautam @ 2018-02-13 8:57 ` Tomasz Figa 0 siblings, 0 replies; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 8:57 UTC (permalink / raw) To: Vivek Gautam, Rob Herring Cc: list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon, Rob Clark Hi Vivek, Thanks for the patch. Please see my comments inline. On Fri, Feb 9, 2018 at 7:57 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > qcom,smmu-v2 is an arm,smmu-v2 implementation with specific > clock and power requirements. This smmu core is used with > multiple masters on msm8996, viz. mdss, video, etc. > Add bindings for the same. > > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > Reviewed-by: Rob Herring <robh@kernel.org> > --- > > Changes in v8: > - Added the missing IOMMU_OF_DECLARE declaration for "qcom,smmu-v2" > > .../devicetree/bindings/iommu/arm,smmu.txt | 43 ++++++++++++++++++++++ > drivers/iommu/arm-smmu.c | 14 +++++++ > 2 files changed, 57 insertions(+) > > diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > index 8a6ffce12af5..169222ae2706 100644 > --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt > +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt > @@ -17,10 +17,19 @@ conditions. > "arm,mmu-401" > "arm,mmu-500" > "cavium,smmu-v2" > + "qcom,<soc>-smmu-v2", "qcom,smmu-v2" > > depending on the particular implementation and/or the > version of the architecture implemented. > > + A number of Qcom SoCs use qcom,smmu-v2 version of the IP. > + "qcom,<soc>-smmu-v2" represents a soc specific compatible > + string that should be present along with the "qcom,smmu-v2" > + to facilitate SoC specific clocks/power connections and to > + address specific bug fixes. > + An example string would be - > + "qcom,msm8996-smmu-v2", "qcom,smmu-v2". Hmm, I remember that for <soc> and similar wildcards we required explicitly listing allowed values. Rob, has it changed since I stumbled upon such thing last time (or I just got it wrong before)? > + > - reg : Base address and size of the SMMU. > > - #global-interrupts : The number of global interrupts exposed by the > @@ -71,6 +80,23 @@ conditions. > or using stream matching with #iommu-cells = <2>, and > may be ignored if present in such cases. > > +- clock-names: Should be "bus", and "iface" for "qcom,smmu-v2" > + implementation. > + > + "bus" clock for "qcom,smmu-v2" is required for downstream > + bus access and for the smmu ptw. > + > + "iface" clock is required to access smmu's registers through > + the TCU's programming interface. nit: Could we have it in a bit more structured way? E.g. - clock-names: List of names of clocks input to the device. The required list depends on particular implementation and is as follows: - for "qcom,smmu-v2": - "bus": clock required for downstream bus access and for the smmu ptw, - "iface": clock required to access smmu's registers through the TCU's programming interface. - unspecified for other implementations. (+/- wrapping) > + > +- clocks: Phandles for respective clocks described by clock-names. Phandle is just a pointer to another node. Clocks are however specified using a phandle and a number of arguments, depending on the clock controller. I'd change it to: - clocks: Specifiers for all clocks listed in the clock-names property, as per generic clock bindings. > + > +- power-domains: Phandles to SMMU's power domain specifier. This is > + required even if SMMU belongs to the master's power > + domain, as the SMMU will have to be enabled and > + accessed before master gets enabled and linked to its > + SMMU. >From DT point of view, the relationship of SMMU belonging to a master device doesn't exist. The power-domains property needs to be properly specified for all devices within power domains represented in DT, with an exception of the case when the parent-child relationship is explicitly stated in DT by child devices being represented by child nodes of the parent device node. - power-domains: Specifiers for power domains required to be powered on for the SMMU to operate, as per generic power domain bindings. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
* [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> ` (4 preceding siblings ...) 2018-02-07 10:31 ` [PATCH v7 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant Vivek Gautam @ 2018-02-07 10:31 ` Vivek Gautam 2018-02-13 9:10 ` Tomasz Figa 5 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-07 10:31 UTC (permalink / raw) To: joro-zLv9SwRftAIdnm+yROfE0A, robh+dt-DgEjT+Ai2ygdnm+yROfE0A, mark.rutland-5wv7dgnIgG8, rjw-LthD3rsA81gm4RdzfppkhA, robin.murphy-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8, robdclark-Re5JQEeQqe8AvxtiuMwx3w, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW Cc: airlied-cv59FeDIM0c, gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA While handling the concerned iommu, there should not be a need to power control the drm devices from iommu interface. If these drm devices need to be powered around this time, the respective drivers should take care of this. Replace the pm_runtime_get/put_sync(<drm_device>) with pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up the connected iommu through the device link interface. In case the device link is not setup these get/put_suppliers() calls will be a no-op, and the iommu driver should take care of powering on its devices accordingly. Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> --- drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index b23d33622f37..1ab629bbee69 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, struct msm_iommu *iommu = to_msm_iommu(mmu); int ret; - pm_runtime_get_sync(mmu->dev); + pm_runtime_get_suppliers(mmu->dev); ret = iommu_attach_device(iommu->domain, mmu->dev); - pm_runtime_put_sync(mmu->dev); + pm_runtime_put_suppliers(mmu->dev); return ret; } @@ -52,9 +52,9 @@ static void msm_iommu_detach(struct msm_mmu *mmu, const char * const *names, { struct msm_iommu *iommu = to_msm_iommu(mmu); - pm_runtime_get_sync(mmu->dev); + pm_runtime_get_suppliers(mmu->dev); iommu_detach_device(iommu->domain, mmu->dev); - pm_runtime_put_sync(mmu->dev); + pm_runtime_put_suppliers(mmu->dev); } static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova, @@ -63,9 +63,9 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova, struct msm_iommu *iommu = to_msm_iommu(mmu); size_t ret; -// pm_runtime_get_sync(mmu->dev); + pm_runtime_get_suppliers(mmu->dev); ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot); -// pm_runtime_put_sync(mmu->dev); + pm_runtime_put_suppliers(mmu->dev); WARN_ON(ret < 0); return (ret == len) ? 0 : -EINVAL; @@ -76,9 +76,9 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t iova, { struct msm_iommu *iommu = to_msm_iommu(mmu); - pm_runtime_get_sync(mmu->dev); + pm_runtime_get_suppliers(mmu->dev); iommu_unmap(iommu->domain, iova, len); - pm_runtime_put_sync(mmu->dev); + pm_runtime_put_suppliers(mmu->dev); return 0; } -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-07 10:31 ` [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers Vivek Gautam @ 2018-02-13 9:10 ` Tomasz Figa [not found] ` <CAAFQd5AjopiX6fDgD+mO+-+d0yj-swEnVCNvccWRBSMO+XVJkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-13 9:10 UTC (permalink / raw) To: Vivek Gautam Cc: Mark Rutland, devicetree, linux-pm, David Airlie, Will Deacon, Rafael J. Wysocki, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, , dri-devel, linux-kernel, Rob Herring, Greg KH, freedreno, Robin Murphy, sboyd, linux-arm-msm Hi Vivek, Thanks for the patch. Please see my comments inline. On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > While handling the concerned iommu, there should not be a > need to power control the drm devices from iommu interface. > If these drm devices need to be powered around this time, > the respective drivers should take care of this. > > Replace the pm_runtime_get/put_sync(<drm_device>) with > pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up > the connected iommu through the device link interface. > In case the device link is not setup these get/put_suppliers() > calls will be a no-op, and the iommu driver should take care of > powering on its devices accordingly. > > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > --- > drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c > index b23d33622f37..1ab629bbee69 100644 > --- a/drivers/gpu/drm/msm/msm_iommu.c > +++ b/drivers/gpu/drm/msm/msm_iommu.c > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, > struct msm_iommu *iommu = to_msm_iommu(mmu); > int ret; > > - pm_runtime_get_sync(mmu->dev); > + pm_runtime_get_suppliers(mmu->dev); > ret = iommu_attach_device(iommu->domain, mmu->dev); > - pm_runtime_put_sync(mmu->dev); > + pm_runtime_put_suppliers(mmu->dev); For me, it looks like a wrong place to handle runtime PM of IOMMU here. iommu_attach_device() calls into IOMMU driver's attach_device() callback and that's where necessary runtime PM gets should happen, if any. In other words, driver A (MSM DRM driver) shouldn't be dealing with power state of device controlled by driver B (ARM SMMU). This is also important for the reasons I stated in my comments to "[PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers". Quoting for everyone's convenience: >> There are however cases in which the consumer wants to power-on >> the supplier, but not itself. >> E.g., A Graphics or multimedia driver wants to power-on the SMMU >> to unmap a buffer and finish the TLB operations without powering >> on itself. > >This sounds strange to me. If the SMMU is powered down, wouldn't the >TLB lose its contents as well (and so no flushing needed)? > >Other than that, what kind of hardware operations would be needed >besides just updating the page tables from the CPU? > In other words, the SMMU driver can deal with hardware state based on return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use() and decide whether some operations are necessary or not, e.g. - a state restore is necessary if the domain was powered off, but we are bringing the master on, - a flush may not be required when (un)mapping with the domain powered off, - etc. Best regards, Tomasz _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5AjopiX6fDgD+mO+-+d0yj-swEnVCNvccWRBSMO+XVJkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5AjopiX6fDgD+mO+-+d0yj-swEnVCNvccWRBSMO+XVJkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-13 16:42 ` Jordan Crouse 2018-02-14 3:31 ` Tomasz Figa 2018-02-13 18:03 ` Rob Clark 1 sibling, 1 reply; 53+ messages in thread From: Jordan Crouse @ 2018-02-13 16:42 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Rafael J. Wysocki, linux-arm-msm-u79uwXL29TY76Z2rM5mHXA, Will Deacon, linux-kernel-u79uwXL29TY76Z2rM5mHXA, dri-devel, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rob Herring, Vivek Gautam, Greg KH, freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, sboyd-sgV2jX0FEOL9JmXXK+q4OQ, Robin Murphy On Tue, Feb 13, 2018 at 06:10:38PM +0900, Tomasz Figa wrote: > Hi Vivek, > > Thanks for the patch. Please see my comments inline. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam@codeaurora.org> wrote: > > While handling the concerned iommu, there should not be a > > need to power control the drm devices from iommu interface. > > If these drm devices need to be powered around this time, > > the respective drivers should take care of this. > > > > Replace the pm_runtime_get/put_sync(<drm_device>) with > > pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up > > the connected iommu through the device link interface. > > In case the device link is not setup these get/put_suppliers() > > calls will be a no-op, and the iommu driver should take care of > > powering on its devices accordingly. > > > > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> > > --- > > drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c > > index b23d33622f37..1ab629bbee69 100644 > > --- a/drivers/gpu/drm/msm/msm_iommu.c > > +++ b/drivers/gpu/drm/msm/msm_iommu.c > > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, > > struct msm_iommu *iommu = to_msm_iommu(mmu); > > int ret; > > > > - pm_runtime_get_sync(mmu->dev); > > + pm_runtime_get_suppliers(mmu->dev); > > ret = iommu_attach_device(iommu->domain, mmu->dev); > > - pm_runtime_put_sync(mmu->dev); > > + pm_runtime_put_suppliers(mmu->dev); > > For me, it looks like a wrong place to handle runtime PM of IOMMU > here. iommu_attach_device() calls into IOMMU driver's attach_device() > callback and that's where necessary runtime PM gets should happen, if > any. In other words, driver A (MSM DRM driver) shouldn't be dealing > with power state of device controlled by driver B (ARM SMMU). This whole thing is confused by the fact that on MSM the GPU and the GPU IOMMU share some of the same clocks and power rail so turning on the GPU also turned on the IOMMU register banks by extension. But if we put that aside the question is who should be responsible for controlling the power in this relationship and there are several good reasons to leave it up to the client device. The most important reason is when we move to the per-instance model where the GPU self-programmings the SMMU registers. In that case, the driver will need to make sure that the SMMU is powered up before submitting the command and then removing the power vote when the commands are done to save energy. Additionally, there might be legitimate reasons in the driver to batch operations - you may wish to attach the device and then map several global buffers immediately - having driver side control prevents several unneeded power transitions. Perhaps the right answer is to do both - allow for the driver to enable the supplier but also do the right power operations at the appropriately places in the IOMMU driver. > This is also important for the reasons I stated in my comments to > "[PATCH v7 1/6] base: power: runtime: Export > pm_runtime_get/put_suppliers". Quoting for everyone's convenience: > > >> There are however cases in which the consumer wants to power-on > >> the supplier, but not itself. > >> E.g., A Graphics or multimedia driver wants to power-on the SMMU > >> to unmap a buffer and finish the TLB operations without powering > >> on itself. > > > >This sounds strange to me. If the SMMU is powered down, wouldn't the > >TLB lose its contents as well (and so no flushing needed)? > > > >Other than that, what kind of hardware operations would be needed > >besides just updating the page tables from the CPU? > > > In other words, the SMMU driver can deal with hardware state based on > return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use() > and decide whether some operations are necessary or not, e.g. > - a state restore is necessary if the domain was powered off, but we > are bringing the master on, > - a flush may not be required when (un)mapping with the domain powered off, > - etc. I agree that there is probably some advanced logic that we can do to conclusively figure out the state of the hardware and improve the behavior. I would love to see the SMMU driver get smarter but for the moment we can't trust it and so we need to force the behavior from the GPU driver. The current code works for a5x and earlier but on sdm845 we can (no longer) treat the GPU and the SMMU as the same device for power purposes so we need this code. If at some point in the future we can start to selectively remove the supplier calls I wouldn't mind one bit. Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-13 16:42 ` Jordan Crouse @ 2018-02-14 3:31 ` Tomasz Figa [not found] ` <CAAFQd5CjQRFATfh-mRQv5J=WefYuxBVTkk=Ju09FoqA-Or5Cvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 3:31 UTC (permalink / raw) To: Tomasz Figa, Vivek Gautam, Mark Rutland, devicetree, linux-pm, David Airlie, Will Deacon, Rafael J. Wysocki, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, dri-devel, Linux Kernel Mailing List, Rob Herring, Greg KH, freedreno, Robin Murphy, Stephen Boyd, linux-arm-msm Hi Jordan, On Wed, Feb 14, 2018 at 1:42 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote: > On Tue, Feb 13, 2018 at 06:10:38PM +0900, Tomasz Figa wrote: >> Hi Vivek, >> >> Thanks for the patch. Please see my comments inline. >> >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >> <vivek.gautam@codeaurora.org> wrote: >> > While handling the concerned iommu, there should not be a >> > need to power control the drm devices from iommu interface. >> > If these drm devices need to be powered around this time, >> > the respective drivers should take care of this. >> > >> > Replace the pm_runtime_get/put_sync(<drm_device>) with >> > pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >> > the connected iommu through the device link interface. >> > In case the device link is not setup these get/put_suppliers() >> > calls will be a no-op, and the iommu driver should take care of >> > powering on its devices accordingly. >> > >> > Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> > --- >> > drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >> > 1 file changed, 8 insertions(+), 8 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >> > index b23d33622f37..1ab629bbee69 100644 >> > --- a/drivers/gpu/drm/msm/msm_iommu.c >> > +++ b/drivers/gpu/drm/msm/msm_iommu.c >> > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >> > struct msm_iommu *iommu = to_msm_iommu(mmu); >> > int ret; >> > >> > - pm_runtime_get_sync(mmu->dev); >> > + pm_runtime_get_suppliers(mmu->dev); >> > ret = iommu_attach_device(iommu->domain, mmu->dev); >> > - pm_runtime_put_sync(mmu->dev); >> > + pm_runtime_put_suppliers(mmu->dev); >> >> For me, it looks like a wrong place to handle runtime PM of IOMMU >> here. iommu_attach_device() calls into IOMMU driver's attach_device() >> callback and that's where necessary runtime PM gets should happen, if >> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >> with power state of device controlled by driver B (ARM SMMU). > > This whole thing is confused by the fact that on MSM the GPU and the GPU IOMMU > share some of the same clocks and power rail so turning on the GPU also > turned on the IOMMU register banks by extension. This is surprisingly not a very surprising case. Exactly the same can be seen on Rockchip SoCs and we're solving the problem using the solution I suggested. In fact, my suggestions to this thread are based on the design we chose for Rockchip, due to the high level of similarity (+/- the GPU directly programming IOMMU registers, which is not present there, but AFAICT it doesn't pose a problem here). > > But if we put that aside the question is who should be responsible for > controlling the power in this relationship and there are several good reasons to > leave it up to the client device. The most important reason is when we move to > the per-instance model where the GPU self-programmings the SMMU registers. In > that case, the driver will need to make sure that the SMMU is powered up before > submitting the command and then removing the power vote when the commands > are done to save energy. I might need more insight on what's going on in your hardware, but with my current understanding I'd argue that that is not right, because: - When submitting commands to the GPU, the GPU driver will pm_runtime_get_sync() on the GPU device, which will automatically do the same on all the linked suppliers, which would also include the SMMU itself. The role of device links here is exactly that the GPU driver doesn't have to care which other devices need to be brought up. - When the GPU is operating, the SMMU power must be supplied anyway, because it needs to be doing the translations, right? Note that by "power" I really mean the physical power supply in the SoC, e.g. as for a power domain. The runtime PM API in its current form (e.g. binary off or on operation) is unsuitable for managing other things, such as clocks (and there is ongoing work on improving it, e.g. by adding support for multiple power states). ^^ The above would be actually guaranteed by your hardware design, where SMMU and GPU share the power domain and clocks. (We used to rely on this in old downstream implementation of Rockchip IOMMU and master drivers in Chromium OS kernel, before we moved to handling the clocks explicitly in the IOMMU driver and properly using device links to manage the power domain and state restoration.) > > Additionally, there might be legitimate reasons in the driver to batch > operations - you may wish to attach the device and then map several global > buffers immediately - having driver side control prevents several unneeded power > transitions. As I mentioned before, these operations wouldn't normally need any power transitions, since mapping with the TLB powered down boils down to just updating the page tables in memory. However, as Robin mentioned before, there might be some hardware factors, such as TLB being powered separately (or retaining contents in some other way), where this wouldn't be ensured indeed. Still, that's where runtime PM autosuspend feature (i.e. delayed suspend) comes to the rescue, with the advantage of handling the cases when the master driver receives map/unmap requests not batched (but maybe a slight drawback in terms of the suspend not happening instantly and losing some power, but it's about power domains, so mainly leakage current, isn't it?) > > Perhaps the right answer is to do both - allow for the driver to enable the > supplier but also do the right power operations at the appropriately places in > the IOMMU driver. > >> This is also important for the reasons I stated in my comments to >> "[PATCH v7 1/6] base: power: runtime: Export >> pm_runtime_get/put_suppliers". Quoting for everyone's convenience: >> >> >> There are however cases in which the consumer wants to power-on >> >> the supplier, but not itself. >> >> E.g., A Graphics or multimedia driver wants to power-on the SMMU >> >> to unmap a buffer and finish the TLB operations without powering >> >> on itself. >> > >> >This sounds strange to me. If the SMMU is powered down, wouldn't the >> >TLB lose its contents as well (and so no flushing needed)? >> > > >> >Other than that, what kind of hardware operations would be needed >> >besides just updating the page tables from the CPU? >> > > >> In other words, the SMMU driver can deal with hardware state based on >> return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use() >> and decide whether some operations are necessary or not, e.g. >> - a state restore is necessary if the domain was powered off, but we >> are bringing the master on, >> - a flush may not be required when (un)mapping with the domain powered off, >> - etc. > > I agree that there is probably some advanced logic that we can do to > conclusively figure out the state of the hardware and improve the behavior. > I would love to see the SMMU driver get smarter but for the moment we can't > trust it and so we need to force the behavior from the GPU driver. The current > code works for a5x and earlier but on sdm845 we can (no longer) treat the GPU > and the SMMU as the same device for power purposes so we need this code. Hmm, you've lost me there. Above you mention that "on MSM the GPU and the GPU IOMMU share some of the same clocks and power rail". Is this no longer the case for sdm845? If so, would you mind shedding a bit more light on how this looks there? Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5CjQRFATfh-mRQv5J=WefYuxBVTkk=Ju09FoqA-Or5Cvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5CjQRFATfh-mRQv5J=WefYuxBVTkk=Ju09FoqA-Or5Cvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 15:48 ` Jordan Crouse [not found] ` <20180214154850.GA25422-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Jordan Crouse @ 2018-02-14 15:48 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Rafael J. Wysocki, linux-arm-msm, Will Deacon, Linux Kernel Mailing List, dri-devel, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rob Herring, Greg KH, freedreno, Stephen Boyd On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote: > Hi Jordan, > > On Wed, Feb 14, 2018 at 1:42 AM, Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > > On Tue, Feb 13, 2018 at 06:10:38PM +0900, Tomasz Figa wrote: > >> Hi Vivek, > >> > >> Thanks for the patch. Please see my comments inline. > >> > >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > >> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > >> > While handling the concerned iommu, there should not be a > >> > need to power control the drm devices from iommu interface. > >> > If these drm devices need to be powered around this time, > >> > the respective drivers should take care of this. > >> > > >> > Replace the pm_runtime_get/put_sync(<drm_device>) with > >> > pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up > >> > the connected iommu through the device link interface. > >> > In case the device link is not setup these get/put_suppliers() > >> > calls will be a no-op, and the iommu driver should take care of > >> > powering on its devices accordingly. > >> > > >> > Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> > >> > --- > >> > drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- > >> > 1 file changed, 8 insertions(+), 8 deletions(-) > >> > > >> > diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c > >> > index b23d33622f37..1ab629bbee69 100644 > >> > --- a/drivers/gpu/drm/msm/msm_iommu.c > >> > +++ b/drivers/gpu/drm/msm/msm_iommu.c > >> > @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, > >> > struct msm_iommu *iommu = to_msm_iommu(mmu); > >> > int ret; > >> > > >> > - pm_runtime_get_sync(mmu->dev); > >> > + pm_runtime_get_suppliers(mmu->dev); > >> > ret = iommu_attach_device(iommu->domain, mmu->dev); > >> > - pm_runtime_put_sync(mmu->dev); > >> > + pm_runtime_put_suppliers(mmu->dev); > >> > >> For me, it looks like a wrong place to handle runtime PM of IOMMU > >> here. iommu_attach_device() calls into IOMMU driver's attach_device() > >> callback and that's where necessary runtime PM gets should happen, if > >> any. In other words, driver A (MSM DRM driver) shouldn't be dealing > >> with power state of device controlled by driver B (ARM SMMU). > > > > This whole thing is confused by the fact that on MSM the GPU and the GPU IOMMU > > share some of the same clocks and power rail so turning on the GPU also > > turned on the IOMMU register banks by extension. > > This is surprisingly not a very surprising case. Exactly the same can > be seen on Rockchip SoCs and we're solving the problem using the > solution I suggested. In fact, my suggestions to this thread are based > on the design we chose for Rockchip, due to the high level of > similarity (+/- the GPU directly programming IOMMU registers, which is > not present there, but AFAICT it doesn't pose a problem here). > > > > > But if we put that aside the question is who should be responsible for > > controlling the power in this relationship and there are several good reasons to > > leave it up to the client device. The most important reason is when we move to > > the per-instance model where the GPU self-programmings the SMMU registers. In > > that case, the driver will need to make sure that the SMMU is powered up before > > submitting the command and then removing the power vote when the commands > > are done to save energy. > > I might need more insight on what's going on in your hardware, but > with my current understanding I'd argue that that is not right, > because: > > - When submitting commands to the GPU, the GPU driver will > pm_runtime_get_sync() on the GPU device, which will automatically do > the same on all the linked suppliers, which would also include the > SMMU itself. The role of device links here is exactly that the GPU > driver doesn't have to care which other devices need to be brought up. This is true. Assuming that the device link works correctly we would not need to explicitly power the SMMU which makes my point entirely moot. > - When the GPU is operating, the SMMU power must be supplied anyway, > because it needs to be doing the translations, right? Note that by > "power" I really mean the physical power supply in the SoC, e.g. as > for a power domain. The runtime PM API in its current form (e.g. > binary off or on operation) is unsuitable for managing other things, > such as clocks (and there is ongoing work on improving it, e.g. by > adding support for multiple power states). As others have pointed out, the register banks and the translation unit are powered separately (or at least, clocked separately). > ^^ The above would be actually guaranteed by your hardware design, > where SMMU and GPU share the power domain and clocks. (We used to rely > on this in old downstream implementation of Rockchip IOMMU and master > drivers in Chromium OS kernel, before we moved to handling the clocks > explicitly in the IOMMU driver and properly using device links to > manage the power domain and state restoration.) I wouldn't call it a guarantee. I would instead say that it works by a happy coincidence that I don't think we should depend on. > > > > Additionally, there might be legitimate reasons in the driver to batch > > operations - you may wish to attach the device and then map several global > > buffers immediately - having driver side control prevents several unneeded power > > transitions. > > As I mentioned before, these operations wouldn't normally need any > power transitions, since mapping with the TLB powered down boils down > to just updating the page tables in memory. However, as Robin > mentioned before, there might be some hardware factors, such as TLB > being powered separately (or retaining contents in some other way), > where this wouldn't be ensured indeed. > > Still, that's where runtime PM autosuspend feature (i.e. delayed > suspend) comes to the rescue, with the advantage of handling the cases > when the master driver receives map/unmap requests not batched (but > maybe a slight drawback in terms of the suspend not happening > instantly and losing some power, but it's about power domains, so > mainly leakage current, isn't it?) > > > > Perhaps the right answer is to do both - allow for the driver to enable the > > supplier but also do the right power operations at the appropriately places in > > the IOMMU driver. > > > >> This is also important for the reasons I stated in my comments to > >> "[PATCH v7 1/6] base: power: runtime: Export > >> pm_runtime_get/put_suppliers". Quoting for everyone's convenience: > >> > >> >> There are however cases in which the consumer wants to power-on > >> >> the supplier, but not itself. > >> >> E.g., A Graphics or multimedia driver wants to power-on the SMMU > >> >> to unmap a buffer and finish the TLB operations without powering > >> >> on itself. > >> > > >> >This sounds strange to me. If the SMMU is powered down, wouldn't the > >> >TLB lose its contents as well (and so no flushing needed)? > >> > > > > >> >Other than that, what kind of hardware operations would be needed > >> >besides just updating the page tables from the CPU? > >> > > > > >> In other words, the SMMU driver can deal with hardware state based on > >> return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use() > >> and decide whether some operations are necessary or not, e.g. > >> - a state restore is necessary if the domain was powered off, but we > >> are bringing the master on, > >> - a flush may not be required when (un)mapping with the domain powered off, > >> - etc. > > > > I agree that there is probably some advanced logic that we can do to > > conclusively figure out the state of the hardware and improve the behavior. > > I would love to see the SMMU driver get smarter but for the moment we can't > > trust it and so we need to force the behavior from the GPU driver. The current > > code works for a5x and earlier but on sdm845 we can (no longer) treat the GPU > > and the SMMU as the same device for power purposes so we need this code. > > Hmm, you've lost me there. Above you mention that "on MSM the GPU and > the GPU IOMMU share some of the same clocks and power rail". Is this > no longer the case for sdm845? If so, would you mind shedding a bit > more light on how this looks there? Sure. I've sent out the code, but it can be confusing, so I'll try to explain it a little better. On a5xx and earlier the GPU power/clocks were directly controlled by the CPU so the pm resume consisted of a handful of clock controls plus the domain(s) controlled by genpd. Starting on sdm845 we have added a new integrated microcontroller called the GMU which takes over power control for the GPU. The GMU runs in real time and it can bring the GPU power up and down very quickly - even quickly enough to collapse between frames. If done right this can save significant leakage. The problem is of course that the GMU is a fully featured processor in its own right so its not longer a matter of just turning on clocks and rails. We need to boot it, load the microcode, establish IPC and so on. As you imagine,the GMU also uses the SMMU to share code with the CPU. The kicker is that the while SMMU and GPU share common clocks, the GMU does not and since from the perspective of the CPU the only device that we control is the GMU and we have to treat the SMMU as a truly separate device and thats how we get to where we are. But as you said, as long as we have the device link correctly set up, I think we might just be able to get away with depending on the supplier chain working during pm resume. I'll test it out today and see how it goes. Thanks, Jordan -- The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <20180214154850.GA25422-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org>]
* Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <20180214154850.GA25422-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> @ 2018-02-14 16:12 ` Rob Clark 2018-02-15 4:09 ` Tomasz Figa 0 siblings, 1 reply; 53+ messages in thread From: Rob Clark @ 2018-02-14 16:12 UTC (permalink / raw) To: Tomasz Figa, Vivek Gautam, Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Will Deacon, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>,, dri-devel, Linux Kernel Mailing List, Rob Herring, Greg KH, freedreno, Robin Murphy, Stephen Boyd, linux-arm-msm On Wed, Feb 14, 2018 at 10:48 AM, Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote: >> >> - When submitting commands to the GPU, the GPU driver will >> pm_runtime_get_sync() on the GPU device, which will automatically do >> the same on all the linked suppliers, which would also include the >> SMMU itself. The role of device links here is exactly that the GPU >> driver doesn't have to care which other devices need to be brought up. > > This is true. Assuming that the device link works correctly we would not need > to explicitly power the SMMU which makes my point entirely moot. Just to point out what motivated this patchset, the biggest problem is iommu_unmap() because that can happen when GPU is not powered on (or in the v4l2 case, because some other device dropped it's reference to the dma-buf allowing it to be free'd). Currently we pm get/put the GPU device around unmap, but it is kinda silly to boot up the GPU just to unmap a buffer. (Semi-related, I would also like to batch map/unmap's, I just haven't gotten around to implementing it yet.. but that would be another case where a single get_supplier()/put_supplier() outside of the iommu would make sense instead of pm_get/put() inside the iommu driver's ->unmap().) If you really dislike the get/put_supplier() approach, then perhaps we need iommu_pm_get()/iommu_pm_put() operations that the iommu user could use to accomplish the same thing? BR, -R -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [Freedreno] [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-14 16:12 ` [Freedreno] " Rob Clark @ 2018-02-15 4:09 ` Tomasz Figa [not found] ` <CAAFQd5C-9mbd3hDSvz10a1oiO0--FT-L4EpsAYcALxxUvk6Fjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-15 4:09 UTC (permalink / raw) To: Rob Clark Cc: Mark Rutland, devicetree, Linux PM, David Airlie, Rafael J. Wysocki, linux-arm-msm, Will Deacon, Linux Kernel Mailing List, dri-devel, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>, , Rob Herring, Vivek Gautam, Greg KH, freedreno, Stephen Boyd, Robin Murphy On Thu, Feb 15, 2018 at 1:12 AM, Rob Clark <robdclark@gmail.com> wrote: > On Wed, Feb 14, 2018 at 10:48 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote: >> On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote: >>> >>> - When submitting commands to the GPU, the GPU driver will >>> pm_runtime_get_sync() on the GPU device, which will automatically do >>> the same on all the linked suppliers, which would also include the >>> SMMU itself. The role of device links here is exactly that the GPU >>> driver doesn't have to care which other devices need to be brought up. >> >> This is true. Assuming that the device link works correctly we would not need >> to explicitly power the SMMU which makes my point entirely moot. > > Just to point out what motivated this patchset, the biggest problem is > iommu_unmap() because that can happen when GPU is not powered on (or > in the v4l2 case, because some other device dropped it's reference to > the dma-buf allowing it to be free'd). Currently we pm get/put the > GPU device around unmap, but it is kinda silly to boot up the GPU just > to unmap a buffer. Note that in V4L2 both mapping and unmapping can happen completely without involving the driver. So AFAICT the approach being implemented by this patchset will not work, because there will be no one to power up the IOMMU before the operation. Moreover, there are platforms for which there is no reason to power up the IOMMU just for map/unmap, because the hardware state is lost anyway and the only real work needed is updating the page tables in memory. (I feel like this is actually true for most of the platforms in the wild, but this is based purely on the not so small number of platforms I worked with, haven't bothered looking for more general evidence.) > > (Semi-related, I would also like to batch map/unmap's, I just haven't > gotten around to implementing it yet.. but that would be another case > where a single get_supplier()/put_supplier() outside of the iommu > would make sense instead of pm_get/put() inside the iommu driver's > ->unmap().) > > If you really dislike the get/put_supplier() approach, then perhaps we > need iommu_pm_get()/iommu_pm_put() operations that the iommu user > could use to accomplish the same thing? I'm afraid this wouldn't work for V4L2 either. And I still haven't been given any evidence that the approach I'm suggesting, which relies only on existing pieces of infrastructure and which worked for both Exynos and Rockchip, including V4L2, wouldn't work for SMMU and/or QC SoCs. Best regards, Tomasz _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5C-9mbd3hDSvz10a1oiO0--FT-L4EpsAYcALxxUvk6Fjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5C-9mbd3hDSvz10a1oiO0--FT-L4EpsAYcALxxUvk6Fjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-15 14:14 ` Rob Clark 0 siblings, 0 replies; 53+ messages in thread From: Rob Clark @ 2018-02-15 14:14 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Rafael J. Wysocki, linux-arm-msm, Will Deacon, Linux Kernel Mailing List, dri-devel, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rob Herring, Vivek Gautam, Greg KH, freedreno, Stephen Boyd, Robin Murphy On Wed, Feb 14, 2018 at 11:09 PM, Tomasz Figa <tfiga@chromium.org> wrote: > On Thu, Feb 15, 2018 at 1:12 AM, Rob Clark <robdclark@gmail.com> wrote: >> On Wed, Feb 14, 2018 at 10:48 AM, Jordan Crouse <jcrouse@codeaurora.org> wrote: >>> On Wed, Feb 14, 2018 at 12:31:29PM +0900, Tomasz Figa wrote: >>>> >>>> - When submitting commands to the GPU, the GPU driver will >>>> pm_runtime_get_sync() on the GPU device, which will automatically do >>>> the same on all the linked suppliers, which would also include the >>>> SMMU itself. The role of device links here is exactly that the GPU >>>> driver doesn't have to care which other devices need to be brought up. >>> >>> This is true. Assuming that the device link works correctly we would not need >>> to explicitly power the SMMU which makes my point entirely moot. >> >> Just to point out what motivated this patchset, the biggest problem is >> iommu_unmap() because that can happen when GPU is not powered on (or >> in the v4l2 case, because some other device dropped it's reference to >> the dma-buf allowing it to be free'd). Currently we pm get/put the >> GPU device around unmap, but it is kinda silly to boot up the GPU just >> to unmap a buffer. > > Note that in V4L2 both mapping and unmapping can happen completely > without involving the driver. So AFAICT the approach being implemented > by this patchset will not work, because there will be no one to power > up the IOMMU before the operation. Moreover, there are platforms for > which there is no reason to power up the IOMMU just for map/unmap, > because the hardware state is lost anyway and the only real work > needed is updating the page tables in memory. (I feel like this is > actually true for most of the platforms in the wild, but this is based > purely on the not so small number of platforms I worked with, haven't > bothered looking for more general evidence.) > At least as far as drm/msm/adreno, I'm not terribly concerned about other platforms that don't need to power up iommu. It's not really the same situation as a IP block that shows up in all different vendor's SoCs. But if you can convince Robin to go for get/put_sync() calls inside the iommu driver, I'm fine with that approach too. That is what I do in qcom_iommu already. But if not I'd like to at least solve this for some platforms if we can't solve for all. BR, -R >> >> (Semi-related, I would also like to batch map/unmap's, I just haven't >> gotten around to implementing it yet.. but that would be another case >> where a single get_supplier()/put_supplier() outside of the iommu >> would make sense instead of pm_get/put() inside the iommu driver's >> ->unmap().) >> >> If you really dislike the get/put_supplier() approach, then perhaps we >> need iommu_pm_get()/iommu_pm_put() operations that the iommu user >> could use to accomplish the same thing? > > I'm afraid this wouldn't work for V4L2 either. And I still haven't > been given any evidence that the approach I'm suggesting, which relies > only on existing pieces of infrastructure and which worked for both > Exynos and Rockchip, including V4L2, wouldn't work for SMMU and/or QC > SoCs. > > Best regards, > Tomasz _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5AjopiX6fDgD+mO+-+d0yj-swEnVCNvccWRBSMO+XVJkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 16:42 ` Jordan Crouse @ 2018-02-13 18:03 ` Rob Clark 2018-02-14 1:59 ` Tomasz Figa 1 sibling, 1 reply; 53+ messages in thread From: Rob Clark @ 2018-02-13 18:03 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> wrote: > Hi Vivek, > > Thanks for the patch. Please see my comments inline. > > On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam > <vivek.gautam@codeaurora.org> wrote: >> While handling the concerned iommu, there should not be a >> need to power control the drm devices from iommu interface. >> If these drm devices need to be powered around this time, >> the respective drivers should take care of this. >> >> Replace the pm_runtime_get/put_sync(<drm_device>) with >> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >> the connected iommu through the device link interface. >> In case the device link is not setup these get/put_suppliers() >> calls will be a no-op, and the iommu driver should take care of >> powering on its devices accordingly. >> >> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >> --- >> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >> 1 file changed, 8 insertions(+), 8 deletions(-) >> >> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >> index b23d33622f37..1ab629bbee69 100644 >> --- a/drivers/gpu/drm/msm/msm_iommu.c >> +++ b/drivers/gpu/drm/msm/msm_iommu.c >> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >> struct msm_iommu *iommu = to_msm_iommu(mmu); >> int ret; >> >> - pm_runtime_get_sync(mmu->dev); >> + pm_runtime_get_suppliers(mmu->dev); >> ret = iommu_attach_device(iommu->domain, mmu->dev); >> - pm_runtime_put_sync(mmu->dev); >> + pm_runtime_put_suppliers(mmu->dev); > > For me, it looks like a wrong place to handle runtime PM of IOMMU > here. iommu_attach_device() calls into IOMMU driver's attach_device() > callback and that's where necessary runtime PM gets should happen, if > any. In other words, driver A (MSM DRM driver) shouldn't be dealing > with power state of device controlled by driver B (ARM SMMU). Note that we end up having to do the same, because of iommu_unmap() while DRM driver is powered off.. it might be cleaner if it was all self contained in the iommu driver, but that would make it so other drivers couldn't call iommu_unmap() from an irq handler, which is apparently something that some of them want to do.. So I'm happy with the pm_runtime_get/put_suppliers() approach as a reasonable compromise. (Perhaps specifically, attach/detach this could move inside the iommu driver, but we still need to get/put_suppliers() for unmap(), so meh) BR, -R > This is also important for the reasons I stated in my comments to > "[PATCH v7 1/6] base: power: runtime: Export > pm_runtime_get/put_suppliers". Quoting for everyone's convenience: > >>> There are however cases in which the consumer wants to power-on >>> the supplier, but not itself. >>> E.g., A Graphics or multimedia driver wants to power-on the SMMU >>> to unmap a buffer and finish the TLB operations without powering >>> on itself. >> >>This sounds strange to me. If the SMMU is powered down, wouldn't the >>TLB lose its contents as well (and so no flushing needed)? >> >>Other than that, what kind of hardware operations would be needed >>besides just updating the page tables from the CPU? >> > > In other words, the SMMU driver can deal with hardware state based on > return value of pm_runtime_get_sync() or pm_runtime_get_if_in_use() > and decide whether some operations are necessary or not, e.g. > - a state restore is necessary if the domain was powered off, but we > are bringing the master on, > - a flush may not be required when (un)mapping with the domain powered off, > - etc. > > Best regards, > Tomasz _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-13 18:03 ` Rob Clark @ 2018-02-14 1:59 ` Tomasz Figa [not found] ` <CAAFQd5BKRumpEfAKNF_RKS-ZZ8D671DfOz4vB2+w1SV3aG9NxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 1:59 UTC (permalink / raw) To: Rob Clark Cc: Vivek Gautam, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark@gmail.com> wrote: > On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> wrote: >> Hi Vivek, >> >> Thanks for the patch. Please see my comments inline. >> >> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >> <vivek.gautam@codeaurora.org> wrote: >>> While handling the concerned iommu, there should not be a >>> need to power control the drm devices from iommu interface. >>> If these drm devices need to be powered around this time, >>> the respective drivers should take care of this. >>> >>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>> the connected iommu through the device link interface. >>> In case the device link is not setup these get/put_suppliers() >>> calls will be a no-op, and the iommu driver should take care of >>> powering on its devices accordingly. >>> >>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>> --- >>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>> 1 file changed, 8 insertions(+), 8 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>> index b23d33622f37..1ab629bbee69 100644 >>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>> int ret; >>> >>> - pm_runtime_get_sync(mmu->dev); >>> + pm_runtime_get_suppliers(mmu->dev); >>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>> - pm_runtime_put_sync(mmu->dev); >>> + pm_runtime_put_suppliers(mmu->dev); >> >> For me, it looks like a wrong place to handle runtime PM of IOMMU >> here. iommu_attach_device() calls into IOMMU driver's attach_device() >> callback and that's where necessary runtime PM gets should happen, if >> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >> with power state of device controlled by driver B (ARM SMMU). > > Note that we end up having to do the same, because of iommu_unmap() > while DRM driver is powered off.. it might be cleaner if it was all > self contained in the iommu driver, but that would make it so other > drivers couldn't call iommu_unmap() from an irq handler, which is > apparently something that some of them want to do.. I'd assume that runtime PM status is already guaranteed to be active when the IRQ handler is running, by some other means (e.g. pm_runtime_get_sync() called earlier, when queuing some work to the hardware). Otherwise, I'm not sure how a powered down device could trigger an IRQ. So, if the master device power is already on, suppliers should be powered on as well, thanks to device links. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5BKRumpEfAKNF_RKS-ZZ8D671DfOz4vB2+w1SV3aG9NxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5BKRumpEfAKNF_RKS-ZZ8D671DfOz4vB2+w1SV3aG9NxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 2:13 ` Rob Clark [not found] ` <CAF6AEGuNZJKtwGZ5mLfqNND2jtU+HYM11UONfAtVTzoM0QVpdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Rob Clark @ 2018-02-14 2:13 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, linux-pm-u79uwXL29TY76Z2rM5mHXA, David Airlie, Will Deacon, Rafael J. Wysocki, Linux Kernel Mailing List, dri-devel, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>> Hi Vivek, >>> >>> Thanks for the patch. Please see my comments inline. >>> >>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>> While handling the concerned iommu, there should not be a >>>> need to power control the drm devices from iommu interface. >>>> If these drm devices need to be powered around this time, >>>> the respective drivers should take care of this. >>>> >>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>> the connected iommu through the device link interface. >>>> In case the device link is not setup these get/put_suppliers() >>>> calls will be a no-op, and the iommu driver should take care of >>>> powering on its devices accordingly. >>>> >>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>> --- >>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>> index b23d33622f37..1ab629bbee69 100644 >>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>> int ret; >>>> >>>> - pm_runtime_get_sync(mmu->dev); >>>> + pm_runtime_get_suppliers(mmu->dev); >>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>> - pm_runtime_put_sync(mmu->dev); >>>> + pm_runtime_put_suppliers(mmu->dev); >>> >>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>> callback and that's where necessary runtime PM gets should happen, if >>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>> with power state of device controlled by driver B (ARM SMMU). >> >> Note that we end up having to do the same, because of iommu_unmap() >> while DRM driver is powered off.. it might be cleaner if it was all >> self contained in the iommu driver, but that would make it so other >> drivers couldn't call iommu_unmap() from an irq handler, which is >> apparently something that some of them want to do.. > > I'd assume that runtime PM status is already guaranteed to be active > when the IRQ handler is running, by some other means (e.g. > pm_runtime_get_sync() called earlier, when queuing some work to the > hardware). Otherwise, I'm not sure how a powered down device could > trigger an IRQ. > > So, if the master device power is already on, suppliers should be > powered on as well, thanks to device links. > umm, that is kindof the inverse of the problem.. the problem is things like gpu driver (and v4l2 drivers that import dma-buf's, afaict).. they will potentially call iommu->unmap() when device is not active (due to userspace or things beyond the control of the driver).. so *they* would want iommu to do pm get/put calls. But other drivers trying to unmap from irq ctx would not. Which is the contradictory requirement that lead to the idea of iommu user powering up iommu for unmap. There has already been some discussion about this on various earlier permutations of this patchset. I think we have exhausted all other options. BR, -R ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAF6AEGuNZJKtwGZ5mLfqNND2jtU+HYM11UONfAtVTzoM0QVpdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAF6AEGuNZJKtwGZ5mLfqNND2jtU+HYM11UONfAtVTzoM0QVpdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 3:01 ` Tomasz Figa [not found] ` <CAAFQd5BZJ1G0RG32hYErNzPRvisBhhiSNCBsjbzfm0WzO=DnsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 3:01 UTC (permalink / raw) To: Rob Clark Cc: Vivek Gautam, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>> Hi Vivek, >>>> >>>> Thanks for the patch. Please see my comments inline. >>>> >>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>> While handling the concerned iommu, there should not be a >>>>> need to power control the drm devices from iommu interface. >>>>> If these drm devices need to be powered around this time, >>>>> the respective drivers should take care of this. >>>>> >>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>> the connected iommu through the device link interface. >>>>> In case the device link is not setup these get/put_suppliers() >>>>> calls will be a no-op, and the iommu driver should take care of >>>>> powering on its devices accordingly. >>>>> >>>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>>> --- >>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>> index b23d33622f37..1ab629bbee69 100644 >>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>> int ret; >>>>> >>>>> - pm_runtime_get_sync(mmu->dev); >>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>> - pm_runtime_put_sync(mmu->dev); >>>>> + pm_runtime_put_suppliers(mmu->dev); >>>> >>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>> callback and that's where necessary runtime PM gets should happen, if >>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>> with power state of device controlled by driver B (ARM SMMU). >>> >>> Note that we end up having to do the same, because of iommu_unmap() >>> while DRM driver is powered off.. it might be cleaner if it was all >>> self contained in the iommu driver, but that would make it so other >>> drivers couldn't call iommu_unmap() from an irq handler, which is >>> apparently something that some of them want to do.. >> >> I'd assume that runtime PM status is already guaranteed to be active >> when the IRQ handler is running, by some other means (e.g. >> pm_runtime_get_sync() called earlier, when queuing some work to the >> hardware). Otherwise, I'm not sure how a powered down device could >> trigger an IRQ. >> >> So, if the master device power is already on, suppliers should be >> powered on as well, thanks to device links. >> > > umm, that is kindof the inverse of the problem.. the problem is > things like gpu driver (and v4l2 drivers that import dma-buf's, > afaict).. they will potentially call iommu->unmap() when device is not > active (due to userspace or things beyond the control of the driver).. > so *they* would want iommu to do pm get/put calls. Which is fine and which is actually already done by one of the patches in this series, not for map/unmap, but probe, add_device, remove_device. Having parts of the API doing it inside the callback and other parts outside sounds at least inconsistent. > But other drivers > trying to unmap from irq ctx would not. Which is the contradictory > requirement that lead to the idea of iommu user powering up iommu for > unmap. Sorry, maybe I wasn't clear. My last message was supposed to show that it's not contradictory at all, because "other drivers trying to unmap from irq ctx" would already have called pm_runtime_get_*() earlier from a non-irq ctx, which would have also done the same on all the linked suppliers, including the IOMMU. The ultimate result would be that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() would do nothing besides incrementing the reference count. > > There has already been some discussion about this on various earlier > permutations of this patchset. I think we have exhausted all other > options. I guess I should have read those. Let me do that now. Best regards, Tomasz -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5BZJ1G0RG32hYErNzPRvisBhhiSNCBsjbzfm0WzO=DnsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5BZJ1G0RG32hYErNzPRvisBhhiSNCBsjbzfm0WzO=DnsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 4:17 ` Vivek Gautam [not found] ` <CAFp+6iHaycK=CcE1S15EeuMkaw8LnW0ebptU0hM6tUtWdeEOtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-14 4:17 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rafael J. Wysocki Hi Tomasz, On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga@chromium.org> wrote: > On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark@gmail.com> wrote: >> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga@chromium.org> wrote: >>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark@gmail.com> wrote: >>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>> Hi Vivek, >>>>> >>>>> Thanks for the patch. Please see my comments inline. >>>>> >>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>> <vivek.gautam@codeaurora.org> wrote: >>>>>> While handling the concerned iommu, there should not be a >>>>>> need to power control the drm devices from iommu interface. >>>>>> If these drm devices need to be powered around this time, >>>>>> the respective drivers should take care of this. >>>>>> >>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>> the connected iommu through the device link interface. >>>>>> In case the device link is not setup these get/put_suppliers() >>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>> powering on its devices accordingly. >>>>>> >>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>>>>> --- >>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>> int ret; >>>>>> >>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>> >>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>> callback and that's where necessary runtime PM gets should happen, if >>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>> with power state of device controlled by driver B (ARM SMMU). >>>> >>>> Note that we end up having to do the same, because of iommu_unmap() >>>> while DRM driver is powered off.. it might be cleaner if it was all >>>> self contained in the iommu driver, but that would make it so other >>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>> apparently something that some of them want to do.. >>> >>> I'd assume that runtime PM status is already guaranteed to be active >>> when the IRQ handler is running, by some other means (e.g. >>> pm_runtime_get_sync() called earlier, when queuing some work to the >>> hardware). Otherwise, I'm not sure how a powered down device could >>> trigger an IRQ. >>> >>> So, if the master device power is already on, suppliers should be >>> powered on as well, thanks to device links. >>> >> >> umm, that is kindof the inverse of the problem.. the problem is >> things like gpu driver (and v4l2 drivers that import dma-buf's, >> afaict).. they will potentially call iommu->unmap() when device is not >> active (due to userspace or things beyond the control of the driver).. >> so *they* would want iommu to do pm get/put calls. > > Which is fine and which is actually already done by one of the patches > in this series, not for map/unmap, but probe, add_device, > remove_device. Having parts of the API doing it inside the callback > and other parts outside sounds at least inconsistent. > >> But other drivers >> trying to unmap from irq ctx would not. Which is the contradictory >> requirement that lead to the idea of iommu user powering up iommu for >> unmap. > > Sorry, maybe I wasn't clear. My last message was supposed to show that > it's not contradictory at all, because "other drivers trying to unmap > from irq ctx" would already have called pm_runtime_get_*() earlier > from a non-irq ctx, which would have also done the same on all the > linked suppliers, including the IOMMU. The ultimate result would be > that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() > would do nothing besides incrementing the reference count. The entire point was to avoid the slowpath that pm_runtime_get/put_sync() would add in map/unmap. It would not be correct to add a slowpath in irq_ctx for taking care of non-irq_ctx and for the situations where master is already powered-off. > >> >> There has already been some discussion about this on various earlier >> permutations of this patchset. I think we have exhausted all other >> options. > > I guess I should have read those. Let me do that now. Yea, i point to the thread in cover letter and [PATCH 1/6]. Thanks. regards Vivek > > Best regards, > Tomasz > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAFp+6iHaycK=CcE1S15EeuMkaw8LnW0ebptU0hM6tUtWdeEOtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAFp+6iHaycK=CcE1S15EeuMkaw8LnW0ebptU0hM6tUtWdeEOtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 5:38 ` Tomasz Figa [not found] ` <CAAFQd5Afj-Bj+3wHwmF2tT7y=46EsYEtO_mXfY6stXBgHutEUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 5:38 UTC (permalink / raw) To: Vivek Gautam Cc: Rob Clark, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>,, Rob Herring, Mark Rutland, Rafael J. Wysocki, Robin Murphy, Will Deacon On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: > Hi Tomasz, > > On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>>> Hi Vivek, >>>>>> >>>>>> Thanks for the patch. Please see my comments inline. >>>>>> >>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>>>> While handling the concerned iommu, there should not be a >>>>>>> need to power control the drm devices from iommu interface. >>>>>>> If these drm devices need to be powered around this time, >>>>>>> the respective drivers should take care of this. >>>>>>> >>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>> the connected iommu through the device link interface. >>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>> powering on its devices accordingly. >>>>>>> >>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>>>>> --- >>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>> int ret; >>>>>>> >>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>> >>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>>> callback and that's where necessary runtime PM gets should happen, if >>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>> >>>>> Note that we end up having to do the same, because of iommu_unmap() >>>>> while DRM driver is powered off.. it might be cleaner if it was all >>>>> self contained in the iommu driver, but that would make it so other >>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>> apparently something that some of them want to do.. >>>> >>>> I'd assume that runtime PM status is already guaranteed to be active >>>> when the IRQ handler is running, by some other means (e.g. >>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>> hardware). Otherwise, I'm not sure how a powered down device could >>>> trigger an IRQ. >>>> >>>> So, if the master device power is already on, suppliers should be >>>> powered on as well, thanks to device links. >>>> >>> >>> umm, that is kindof the inverse of the problem.. the problem is >>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>> afaict).. they will potentially call iommu->unmap() when device is not >>> active (due to userspace or things beyond the control of the driver).. >>> so *they* would want iommu to do pm get/put calls. >> >> Which is fine and which is actually already done by one of the patches >> in this series, not for map/unmap, but probe, add_device, >> remove_device. Having parts of the API doing it inside the callback >> and other parts outside sounds at least inconsistent. >> >>> But other drivers >>> trying to unmap from irq ctx would not. Which is the contradictory >>> requirement that lead to the idea of iommu user powering up iommu for >>> unmap. >> >> Sorry, maybe I wasn't clear. My last message was supposed to show that >> it's not contradictory at all, because "other drivers trying to unmap >> from irq ctx" would already have called pm_runtime_get_*() earlier >> from a non-irq ctx, which would have also done the same on all the >> linked suppliers, including the IOMMU. The ultimate result would be >> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() >> would do nothing besides incrementing the reference count. > > The entire point was to avoid the slowpath that pm_runtime_get/put_sync() > would add in map/unmap. It would not be correct to add a slowpath in irq_ctx > for taking care of non-irq_ctx and for the situations where master is already > powered-off. Correct me if I'm wrong, but I believe that with what I'm proposing there wouldn't be any slow path. a) For IRQ context, the master is already powered on and so the SMMU is also powered on, through respective device link. pm_runtime_get_sync() would ultimately just increment the runtime PM usage count. b) For a case when the master is already powered off (which wouldn't be IRQ context, for the reason stated in a)), powering on the SMMU is unavoidable, if the SMMU hardware really needs to be accessed (i.e. some TLBs need to be invalidated, if their state is preserved despite master being powered down). > >> >>> >>> There has already been some discussion about this on various earlier >>> permutations of this patchset. I think we have exhausted all other >>> options. >> >> I guess I should have read those. Let me do that now. > Yea, i point to the thread in cover letter and [PATCH 1/6]. > Thanks. I read through all the links in the cover letter and I could see other attempts not working out indeed, but they were different from what I'm proposing. There was also a point raised that __pm_runtime_resume() called from pm_runtime_get_sync() would grab dev->power_lock spinlock, which is true, except that if the device is already active, it would do it only for the time of checking device state, so I doubt it would really be a significant point of contention. Best regards, Tomasz -- To unsubscribe from this list: send the line "unsubscribe devicetree" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5Afj-Bj+3wHwmF2tT7y=46EsYEtO_mXfY6stXBgHutEUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5Afj-Bj+3wHwmF2tT7y=46EsYEtO_mXfY6stXBgHutEUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 9:13 ` Vivek Gautam [not found] ` <CAFp+6iGX6pr+MdPSSHHG=qOnhHky_8OHiDqAcJ9UudEUv=JMHg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-14 9:13 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , dri-devel, Linux Kernel Mailing List, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm Hi Tomasz, On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam > <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >> Hi Tomasz, >> >> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>>>> Hi Vivek, >>>>>>> >>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>> >>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>> If these drm devices need to be powered around this time, >>>>>>>> the respective drivers should take care of this. >>>>>>>> >>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>> the connected iommu through the device link interface. >>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>> powering on its devices accordingly. >>>>>>>> >>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>> int ret; >>>>>>>> >>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>> >>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>>>> callback and that's where necessary runtime PM gets should happen, if >>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>> >>>>>> Note that we end up having to do the same, because of iommu_unmap() >>>>>> while DRM driver is powered off.. it might be cleaner if it was all >>>>>> self contained in the iommu driver, but that would make it so other >>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>> apparently something that some of them want to do.. >>>>> >>>>> I'd assume that runtime PM status is already guaranteed to be active >>>>> when the IRQ handler is running, by some other means (e.g. >>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>> trigger an IRQ. >>>>> >>>>> So, if the master device power is already on, suppliers should be >>>>> powered on as well, thanks to device links. >>>>> >>>> >>>> umm, that is kindof the inverse of the problem.. the problem is >>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>> afaict).. they will potentially call iommu->unmap() when device is not >>>> active (due to userspace or things beyond the control of the driver).. >>>> so *they* would want iommu to do pm get/put calls. >>> >>> Which is fine and which is actually already done by one of the patches >>> in this series, not for map/unmap, but probe, add_device, >>> remove_device. Having parts of the API doing it inside the callback >>> and other parts outside sounds at least inconsistent. >>> >>>> But other drivers >>>> trying to unmap from irq ctx would not. Which is the contradictory >>>> requirement that lead to the idea of iommu user powering up iommu for >>>> unmap. >>> >>> Sorry, maybe I wasn't clear. My last message was supposed to show that >>> it's not contradictory at all, because "other drivers trying to unmap >>> from irq ctx" would already have called pm_runtime_get_*() earlier >>> from a non-irq ctx, which would have also done the same on all the >>> linked suppliers, including the IOMMU. The ultimate result would be >>> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() >>> would do nothing besides incrementing the reference count. >> >> The entire point was to avoid the slowpath that pm_runtime_get/put_sync() >> would add in map/unmap. It would not be correct to add a slowpath in irq_ctx >> for taking care of non-irq_ctx and for the situations where master is already >> powered-off. > > Correct me if I'm wrong, but I believe that with what I'm proposing > there wouldn't be any slow path. Yea, but only when the power domain is irq-safe? And not all platforms enable irq-safe power domains. For instance, msm doesn't enable its gdsc power domains as irq-safe. Is it something i am missing? > > a) For IRQ context, the master is already powered on and so the SMMU > is also powered on, through respective device link. > pm_runtime_get_sync() would ultimately just increment the runtime PM > usage count. > > b) For a case when the master is already powered off (which wouldn't > be IRQ context, for the reason stated in a)), powering on the SMMU is > unavoidable, if the SMMU hardware really needs to be accessed (i.e. > some TLBs need to be invalidated, if their state is preserved despite > master being powered down). > >> >>> >>>> >>>> There has already been some discussion about this on various earlier >>>> permutations of this patchset. I think we have exhausted all other >>>> options. >>> >>> I guess I should have read those. Let me do that now. >> Yea, i point to the thread in cover letter and [PATCH 1/6]. >> Thanks. > > I read through all the links in the cover letter and I could see other > attempts not working out indeed, but they were different from what I'm > proposing. > > There was also a point raised that __pm_runtime_resume() called from > pm_runtime_get_sync() would grab dev->power_lock spinlock, which is > true, except that if the device is already active, it would do it only > for the time of checking device state, so I doubt it would really be a > significant point of contention. > > Best regards, > Tomasz > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAFp+6iGX6pr+MdPSSHHG=qOnhHky_8OHiDqAcJ9UudEUv=JMHg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAFp+6iGX6pr+MdPSSHHG=qOnhHky_8OHiDqAcJ9UudEUv=JMHg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 9:16 ` Tomasz Figa [not found] ` <CAAFQd5DiwAugGnPOTw0+XrEfef9x-n-vx59JFuXpNawjiXHwCw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-14 9:16 UTC (permalink / raw) To: Vivek Gautam Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , Rafael J. Wysocki On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam <vivek.gautam@codeaurora.org> wrote: > Hi Tomasz, > > On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga@chromium.org> wrote: >> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam >> <vivek.gautam@codeaurora.org> wrote: >>> Hi Tomasz, >>> >>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark@gmail.com> wrote: >>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark@gmail.com> wrote: >>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>>>>> Hi Vivek, >>>>>>>> >>>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>>> >>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>>> <vivek.gautam@codeaurora.org> wrote: >>>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>>> If these drm devices need to be powered around this time, >>>>>>>>> the respective drivers should take care of this. >>>>>>>>> >>>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>>> the connected iommu through the device link interface. >>>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>>> powering on its devices accordingly. >>>>>>>>> >>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>>> int ret; >>>>>>>>> >>>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>>> >>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>>>>> callback and that's where necessary runtime PM gets should happen, if >>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>>> >>>>>>> Note that we end up having to do the same, because of iommu_unmap() >>>>>>> while DRM driver is powered off.. it might be cleaner if it was all >>>>>>> self contained in the iommu driver, but that would make it so other >>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>>> apparently something that some of them want to do.. >>>>>> >>>>>> I'd assume that runtime PM status is already guaranteed to be active >>>>>> when the IRQ handler is running, by some other means (e.g. >>>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>>> trigger an IRQ. >>>>>> >>>>>> So, if the master device power is already on, suppliers should be >>>>>> powered on as well, thanks to device links. >>>>>> >>>>> >>>>> umm, that is kindof the inverse of the problem.. the problem is >>>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>>> afaict).. they will potentially call iommu->unmap() when device is not >>>>> active (due to userspace or things beyond the control of the driver).. >>>>> so *they* would want iommu to do pm get/put calls. >>>> >>>> Which is fine and which is actually already done by one of the patches >>>> in this series, not for map/unmap, but probe, add_device, >>>> remove_device. Having parts of the API doing it inside the callback >>>> and other parts outside sounds at least inconsistent. >>>> >>>>> But other drivers >>>>> trying to unmap from irq ctx would not. Which is the contradictory >>>>> requirement that lead to the idea of iommu user powering up iommu for >>>>> unmap. >>>> >>>> Sorry, maybe I wasn't clear. My last message was supposed to show that >>>> it's not contradictory at all, because "other drivers trying to unmap >>>> from irq ctx" would already have called pm_runtime_get_*() earlier >>>> from a non-irq ctx, which would have also done the same on all the >>>> linked suppliers, including the IOMMU. The ultimate result would be >>>> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() >>>> would do nothing besides incrementing the reference count. >>> >>> The entire point was to avoid the slowpath that pm_runtime_get/put_sync() >>> would add in map/unmap. It would not be correct to add a slowpath in irq_ctx >>> for taking care of non-irq_ctx and for the situations where master is already >>> powered-off. >> >> Correct me if I'm wrong, but I believe that with what I'm proposing >> there wouldn't be any slow path. > > Yea, but only when the power domain is irq-safe? And not all platforms > enable irq-safe power domains. For instance, msm doesn't enable its > gdsc power domains as irq-safe. > Is it something i am missing? irq-safe would matter if there would exist a case when the call is done from IRQ context and the power is off. As I explained in a), it shouldn't happen. Best regards, Tomasz > >> >> a) For IRQ context, the master is already powered on and so the SMMU >> is also powered on, through respective device link. >> pm_runtime_get_sync() would ultimately just increment the runtime PM >> usage count. >> >> b) For a case when the master is already powered off (which wouldn't >> be IRQ context, for the reason stated in a)), powering on the SMMU is >> unavoidable, if the SMMU hardware really needs to be accessed (i.e. >> some TLBs need to be invalidated, if their state is preserved despite >> master being powered down). >> >>> >>>> >>>>> >>>>> There has already been some discussion about this on various earlier >>>>> permutations of this patchset. I think we have exhausted all other >>>>> options. >>>> >>>> I guess I should have read those. Let me do that now. >>> Yea, i point to the thread in cover letter and [PATCH 1/6]. >>> Thanks. >> >> I read through all the links in the cover letter and I could see other >> attempts not working out indeed, but they were different from what I'm >> proposing. >> >> There was also a point raised that __pm_runtime_resume() called from >> pm_runtime_get_sync() would grab dev->power_lock spinlock, which is >> true, except that if the device is already active, it would do it only >> for the time of checking device state, so I doubt it would really be a >> significant point of contention. >> >> Best regards, >> Tomasz >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member > of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5DiwAugGnPOTw0+XrEfef9x-n-vx59JFuXpNawjiXHwCw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5DiwAugGnPOTw0+XrEfef9x-n-vx59JFuXpNawjiXHwCw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 10:33 ` Vivek Gautam [not found] ` <CAFp+6iEW0faeHDfzN_F1bRrHGcVo3sPCk4HSY=t9dnEvHkDkYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Vivek Gautam @ 2018-02-14 10:33 UTC (permalink / raw) To: Robin Murphy, Will Deacon Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Linux Kernel Mailing List, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS <iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>, Joerg Roedel <joro-zLv9SwRftAIdnm+yROfE0A@public.gmane.org>, , dri-devel, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: Adding Jordan to this thread as well. > On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam > <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >> Hi Tomasz, >> >> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam >>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>> Hi Tomasz, >>>> >>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: >>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>>>>>>>> Hi Vivek, >>>>>>>>> >>>>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>>>> >>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>>>> If these drm devices need to be powered around this time, >>>>>>>>>> the respective drivers should take care of this. >>>>>>>>>> >>>>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>>>> the connected iommu through the device link interface. >>>>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>>>> powering on its devices accordingly. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>>>>>>>> --- >>>>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>>>> int ret; >>>>>>>>>> >>>>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>>>> >>>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>>>>>> callback and that's where necessary runtime PM gets should happen, if >>>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>>>> >>>>>>>> Note that we end up having to do the same, because of iommu_unmap() >>>>>>>> while DRM driver is powered off.. it might be cleaner if it was all >>>>>>>> self contained in the iommu driver, but that would make it so other >>>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>>>> apparently something that some of them want to do.. >>>>>>> >>>>>>> I'd assume that runtime PM status is already guaranteed to be active >>>>>>> when the IRQ handler is running, by some other means (e.g. >>>>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>>>> trigger an IRQ. >>>>>>> >>>>>>> So, if the master device power is already on, suppliers should be >>>>>>> powered on as well, thanks to device links. >>>>>>> >>>>>> >>>>>> umm, that is kindof the inverse of the problem.. the problem is >>>>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>>>> afaict).. they will potentially call iommu->unmap() when device is not >>>>>> active (due to userspace or things beyond the control of the driver).. >>>>>> so *they* would want iommu to do pm get/put calls. >>>>> >>>>> Which is fine and which is actually already done by one of the patches >>>>> in this series, not for map/unmap, but probe, add_device, >>>>> remove_device. Having parts of the API doing it inside the callback >>>>> and other parts outside sounds at least inconsistent. >>>>> >>>>>> But other drivers >>>>>> trying to unmap from irq ctx would not. Which is the contradictory >>>>>> requirement that lead to the idea of iommu user powering up iommu for >>>>>> unmap. >>>>> >>>>> Sorry, maybe I wasn't clear. My last message was supposed to show that >>>>> it's not contradictory at all, because "other drivers trying to unmap >>>>> from irq ctx" would already have called pm_runtime_get_*() earlier >>>>> from a non-irq ctx, which would have also done the same on all the >>>>> linked suppliers, including the IOMMU. The ultimate result would be >>>>> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() >>>>> would do nothing besides incrementing the reference count. >>>> >>>> The entire point was to avoid the slowpath that pm_runtime_get/put_sync() >>>> would add in map/unmap. It would not be correct to add a slowpath in irq_ctx >>>> for taking care of non-irq_ctx and for the situations where master is already >>>> powered-off. >>> >>> Correct me if I'm wrong, but I believe that with what I'm proposing >>> there wouldn't be any slow path. >> >> Yea, but only when the power domain is irq-safe? And not all platforms >> enable irq-safe power domains. For instance, msm doesn't enable its >> gdsc power domains as irq-safe. >> Is it something i am missing? > > irq-safe would matter if there would exist a case when the call is > done from IRQ context and the power is off. As I explained in a), it > shouldn't happen. Hi Robin, Will Does adding pm_runtime_get() in map/unmap sounds good to you? Quoting Tomasz once again here: >>> a) For IRQ context, the master is already powered on and so the SMMU >>> is also powered on, through respective device link. >>> pm_runtime_get_sync() would ultimately just increment the runtime PM >>> usage count. >>> >>> b) For a case when the master is already powered off (which wouldn't >>> be IRQ context, for the reason stated in a)), powering on the SMMU is >>> unavoidable, if the SMMU hardware really needs to be accessed (i.e. >>> some TLBs need to be invalidated, if their state is preserved despite >>> master being powered down). >>> There was also a point raised that __pm_runtime_resume() called from >>> pm_runtime_get_sync() would grab dev->power_lock spinlock, which is >>> true, except that if the device is already active, it would do it only >>> for the time of checking device state, so I doubt it would really be a >>> significant point of contention. Regards Vivek > > Best regards, > Tomasz > >> >>> >>> a) For IRQ context, the master is already powered on and so the SMMU >>> is also powered on, through respective device link. >>> pm_runtime_get_sync() would ultimately just increment the runtime PM >>> usage count. >>> >>> b) For a case when the master is already powered off (which wouldn't >>> be IRQ context, for the reason stated in a)), powering on the SMMU is >>> unavoidable, if the SMMU hardware really needs to be accessed (i.e. >>> some TLBs need to be invalidated, if their state is preserved despite >>> master being powered down). >>> >>>> >>>>> >>>>>> >>>>>> There has already been some discussion about this on various earlier >>>>>> permutations of this patchset. I think we have exhausted all other >>>>>> options. >>>>> >>>>> I guess I should have read those. Let me do that now. >>>> Yea, i point to the thread in cover letter and [PATCH 1/6]. >>>> Thanks. >>> >>> I read through all the links in the cover letter and I could see other >>> attempts not working out indeed, but they were different from what I'm >>> proposing. >>> >>> There was also a point raised that __pm_runtime_resume() called from >>> pm_runtime_get_sync() would grab dev->power_lock spinlock, which is >>> true, except that if the device is already active, it would do it only >>> for the time of checking device state, so I doubt it would really be a >>> significant point of contention. >>> >>> Best regards, >>> Tomasz >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in >>> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> >> >> -- >> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member >> of Code Aurora Forum, hosted by The Linux Foundation > -- > To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAFp+6iEW0faeHDfzN_F1bRrHGcVo3sPCk4HSY=t9dnEvHkDkYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAFp+6iEW0faeHDfzN_F1bRrHGcVo3sPCk4HSY=t9dnEvHkDkYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-14 16:03 ` Robin Murphy 2018-02-15 3:17 ` Tomasz Figa 0 siblings, 1 reply; 53+ messages in thread From: Robin Murphy @ 2018-02-14 16:03 UTC (permalink / raw) To: Vivek Gautam, Will Deacon Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Linux PM, David Airlie, Linux Kernel Mailing List, Joerg Roedel, Rafael J. Wysocki, iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, dri-devel, Tomasz Figa, Rob Clark, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On 14/02/18 10:33, Vivek Gautam wrote: > On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tfiga@chromium.org> wrote: > > Adding Jordan to this thread as well. > >> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam >> <vivek.gautam@codeaurora.org> wrote: >>> Hi Tomasz, >>> >>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam >>>> <vivek.gautam@codeaurora.org> wrote: >>>>> Hi Tomasz, >>>>> >>>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark@gmail.com> wrote: >>>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark@gmail.com> wrote: >>>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> wrote: >>>>>>>>>> Hi Vivek, >>>>>>>>>> >>>>>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>>>>> >>>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>>>>> <vivek.gautam@codeaurora.org> wrote: >>>>>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>>>>> If these drm devices need to be powered around this time, >>>>>>>>>>> the respective drivers should take care of this. >>>>>>>>>>> >>>>>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>>>>> the connected iommu through the device link interface. >>>>>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>>>>> powering on its devices accordingly. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu *mmu, const char * const *names, >>>>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>>>>> int ret; >>>>>>>>>>> >>>>>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>>>>> >>>>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>>>>> here. iommu_attach_device() calls into IOMMU driver's attach_device() >>>>>>>>>> callback and that's where necessary runtime PM gets should happen, if >>>>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be dealing >>>>>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>>>>> >>>>>>>>> Note that we end up having to do the same, because of iommu_unmap() >>>>>>>>> while DRM driver is powered off.. it might be cleaner if it was all >>>>>>>>> self contained in the iommu driver, but that would make it so other >>>>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>>>>> apparently something that some of them want to do.. >>>>>>>> >>>>>>>> I'd assume that runtime PM status is already guaranteed to be active >>>>>>>> when the IRQ handler is running, by some other means (e.g. >>>>>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>>>>> trigger an IRQ. >>>>>>>> >>>>>>>> So, if the master device power is already on, suppliers should be >>>>>>>> powered on as well, thanks to device links. >>>>>>>> >>>>>>> >>>>>>> umm, that is kindof the inverse of the problem.. the problem is >>>>>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>>>>> afaict).. they will potentially call iommu->unmap() when device is not >>>>>>> active (due to userspace or things beyond the control of the driver).. >>>>>>> so *they* would want iommu to do pm get/put calls. >>>>>> >>>>>> Which is fine and which is actually already done by one of the patches >>>>>> in this series, not for map/unmap, but probe, add_device, >>>>>> remove_device. Having parts of the API doing it inside the callback >>>>>> and other parts outside sounds at least inconsistent. >>>>>> >>>>>>> But other drivers >>>>>>> trying to unmap from irq ctx would not. Which is the contradictory >>>>>>> requirement that lead to the idea of iommu user powering up iommu for >>>>>>> unmap. >>>>>> >>>>>> Sorry, maybe I wasn't clear. My last message was supposed to show that >>>>>> it's not contradictory at all, because "other drivers trying to unmap >>>>>> from irq ctx" would already have called pm_runtime_get_*() earlier >>>>>> from a non-irq ctx, which would have also done the same on all the >>>>>> linked suppliers, including the IOMMU. The ultimate result would be >>>>>> that the map/unmap() of the IOMMU driver calling pm_runtime_get_sync() >>>>>> would do nothing besides incrementing the reference count. >>>>> >>>>> The entire point was to avoid the slowpath that pm_runtime_get/put_sync() >>>>> would add in map/unmap. It would not be correct to add a slowpath in irq_ctx >>>>> for taking care of non-irq_ctx and for the situations where master is already >>>>> powered-off. >>>> >>>> Correct me if I'm wrong, but I believe that with what I'm proposing >>>> there wouldn't be any slow path. >>> >>> Yea, but only when the power domain is irq-safe? And not all platforms >>> enable irq-safe power domains. For instance, msm doesn't enable its >>> gdsc power domains as irq-safe. >>> Is it something i am missing? >> >> irq-safe would matter if there would exist a case when the call is >> done from IRQ context and the power is off. As I explained in a), it >> shouldn't happen. > > Hi Robin, Will > > Does adding pm_runtime_get() in map/unmap sounds good to you? Given that we spent significant effort last year removing as much locking as we possibly could from the map/unmap path to minimise the significant performance impact it was having on networking/storage/etc. workloads, I really don't want to introduce more for the sake of one specific use-case, so no. Robin. _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-14 16:03 ` Robin Murphy @ 2018-02-15 3:17 ` Tomasz Figa [not found] ` <CAAFQd5AmG1zSm+CouXOCJbs8SNGFk1-RqfU1nWGjMGJMB-qfvw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-15 3:17 UTC (permalink / raw) To: Robin Murphy Cc: Vivek Gautam, Will Deacon, Rob Clark, list@263.net:IOMMU DRIVERS, Joerg Roedel, Rob Herring, Mark Rutland, Rafael J. Wysocki, devicetree, Linux Kernel Mailing List, Linux PM, dri-devel, freedreno, David Airlie, Greg KH, Stephen Boyd, linux-arm-msm, jcrouse On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy <robin.murphy@arm.com> wrote: > On 14/02/18 10:33, Vivek Gautam wrote: >> >> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tfiga@chromium.org> wrote: >> >> Adding Jordan to this thread as well. >> >>> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam >>> <vivek.gautam@codeaurora.org> wrote: >>>> >>>> Hi Tomasz, >>>> >>>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga@chromium.org> >>>> wrote: >>>>> >>>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam >>>>> <vivek.gautam@codeaurora.org> wrote: >>>>>> >>>>>> Hi Tomasz, >>>>>> >>>>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga@chromium.org> >>>>>> wrote: >>>>>>> >>>>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga@chromium.org> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark@gmail.com> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga@chromium.org> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Vivek, >>>>>>>>>>> >>>>>>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>>>>>> >>>>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>>>>>> <vivek.gautam@codeaurora.org> wrote: >>>>>>>>>>>> >>>>>>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>>>>>> If these drm devices need to be powered around this time, >>>>>>>>>>>> the respective drivers should take care of this. >>>>>>>>>>>> >>>>>>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>>>>>> the connected iommu through the device link interface. >>>>>>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>>>>>> powering on its devices accordingly. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam@codeaurora.org> >>>>>>>>>>>> --- >>>>>>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu >>>>>>>>>>>> *mmu, const char * const *names, >>>>>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>>>>>> int ret; >>>>>>>>>>>> >>>>>>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>>>>>> here. iommu_attach_device() calls into IOMMU driver's >>>>>>>>>>> attach_device() >>>>>>>>>>> callback and that's where necessary runtime PM gets should >>>>>>>>>>> happen, if >>>>>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be >>>>>>>>>>> dealing >>>>>>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Note that we end up having to do the same, because of >>>>>>>>>> iommu_unmap() >>>>>>>>>> while DRM driver is powered off.. it might be cleaner if it was >>>>>>>>>> all >>>>>>>>>> self contained in the iommu driver, but that would make it so >>>>>>>>>> other >>>>>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>>>>>> apparently something that some of them want to do.. >>>>>>>>> >>>>>>>>> >>>>>>>>> I'd assume that runtime PM status is already guaranteed to be >>>>>>>>> active >>>>>>>>> when the IRQ handler is running, by some other means (e.g. >>>>>>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>>>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>>>>>> trigger an IRQ. >>>>>>>>> >>>>>>>>> So, if the master device power is already on, suppliers should be >>>>>>>>> powered on as well, thanks to device links. >>>>>>>>> >>>>>>>> >>>>>>>> umm, that is kindof the inverse of the problem.. the problem is >>>>>>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>>>>>> afaict).. they will potentially call iommu->unmap() when device is >>>>>>>> not >>>>>>>> active (due to userspace or things beyond the control of the >>>>>>>> driver).. >>>>>>>> so *they* would want iommu to do pm get/put calls. >>>>>>> >>>>>>> >>>>>>> Which is fine and which is actually already done by one of the >>>>>>> patches >>>>>>> in this series, not for map/unmap, but probe, add_device, >>>>>>> remove_device. Having parts of the API doing it inside the callback >>>>>>> and other parts outside sounds at least inconsistent. >>>>>>> >>>>>>>> But other drivers >>>>>>>> trying to unmap from irq ctx would not. Which is the contradictory >>>>>>>> requirement that lead to the idea of iommu user powering up iommu >>>>>>>> for >>>>>>>> unmap. >>>>>>> >>>>>>> >>>>>>> Sorry, maybe I wasn't clear. My last message was supposed to show >>>>>>> that >>>>>>> it's not contradictory at all, because "other drivers trying to unmap >>>>>>> from irq ctx" would already have called pm_runtime_get_*() earlier >>>>>>> from a non-irq ctx, which would have also done the same on all the >>>>>>> linked suppliers, including the IOMMU. The ultimate result would be >>>>>>> that the map/unmap() of the IOMMU driver calling >>>>>>> pm_runtime_get_sync() >>>>>>> would do nothing besides incrementing the reference count. >>>>>> >>>>>> >>>>>> The entire point was to avoid the slowpath that >>>>>> pm_runtime_get/put_sync() >>>>>> would add in map/unmap. It would not be correct to add a slowpath in >>>>>> irq_ctx >>>>>> for taking care of non-irq_ctx and for the situations where master is >>>>>> already >>>>>> powered-off. >>>>> >>>>> >>>>> Correct me if I'm wrong, but I believe that with what I'm proposing >>>>> there wouldn't be any slow path. >>>> >>>> >>>> Yea, but only when the power domain is irq-safe? And not all platforms >>>> enable irq-safe power domains. For instance, msm doesn't enable its >>>> gdsc power domains as irq-safe. >>>> Is it something i am missing? >>> >>> >>> irq-safe would matter if there would exist a case when the call is >>> done from IRQ context and the power is off. As I explained in a), it >>> shouldn't happen. >> >> >> Hi Robin, Will >> >> Does adding pm_runtime_get() in map/unmap sounds good to you? > > > Given that we spent significant effort last year removing as much locking as > we possibly could from the map/unmap path to minimise the significant > performance impact it was having on networking/storage/etc. workloads, I > really don't want to introduce more for the sake of one specific use-case, > so no. Could you elaborate on what kind of locking you are concerned about? As I explained before, the normally happening fast path would lock dev->power_lock only for the brief moment of incrementing the runtime PM usage counter. Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5AmG1zSm+CouXOCJbs8SNGFk1-RqfU1nWGjMGJMB-qfvw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5AmG1zSm+CouXOCJbs8SNGFk1-RqfU1nWGjMGJMB-qfvw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-15 4:17 ` Tomasz Figa [not found] ` <CAAFQd5A9B-di9svtiJbvk2hz1U1xo61rTY5vt6AD+KR5iMcG-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-15 4:17 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Rafael J. Wysocki, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Thu, Feb 15, 2018 at 12:17 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: > On Thu, Feb 15, 2018 at 1:03 AM, Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org> wrote: >> On 14/02/18 10:33, Vivek Gautam wrote: >>> >>> On Wed, Feb 14, 2018 at 2:46 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> wrote: >>> >>> Adding Jordan to this thread as well. >>> >>>> On Wed, Feb 14, 2018 at 6:13 PM, Vivek Gautam >>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>> >>>>> Hi Tomasz, >>>>> >>>>> On Wed, Feb 14, 2018 at 11:08 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>>>> wrote: >>>>>> >>>>>> On Wed, Feb 14, 2018 at 1:17 PM, Vivek Gautam >>>>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>>>> >>>>>>> Hi Tomasz, >>>>>>> >>>>>>> On Wed, Feb 14, 2018 at 8:31 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>>>>>> wrote: >>>>>>>> >>>>>>>> On Wed, Feb 14, 2018 at 11:13 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Tue, Feb 13, 2018 at 8:59 PM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On Wed, Feb 14, 2018 at 3:03 AM, Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> On Tue, Feb 13, 2018 at 4:10 AM, Tomasz Figa <tfiga-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Vivek, >>>>>>>>>>>> >>>>>>>>>>>> Thanks for the patch. Please see my comments inline. >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Feb 7, 2018 at 7:31 PM, Vivek Gautam >>>>>>>>>>>> <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> While handling the concerned iommu, there should not be a >>>>>>>>>>>>> need to power control the drm devices from iommu interface. >>>>>>>>>>>>> If these drm devices need to be powered around this time, >>>>>>>>>>>>> the respective drivers should take care of this. >>>>>>>>>>>>> >>>>>>>>>>>>> Replace the pm_runtime_get/put_sync(<drm_device>) with >>>>>>>>>>>>> pm_runtime_get/put_suppliers(<drm_device>) calls, to power-up >>>>>>>>>>>>> the connected iommu through the device link interface. >>>>>>>>>>>>> In case the device link is not setup these get/put_suppliers() >>>>>>>>>>>>> calls will be a no-op, and the iommu driver should take care of >>>>>>>>>>>>> powering on its devices accordingly. >>>>>>>>>>>>> >>>>>>>>>>>>> Signed-off-by: Vivek Gautam <vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> >>>>>>>>>>>>> --- >>>>>>>>>>>>> drivers/gpu/drm/msm/msm_iommu.c | 16 ++++++++-------- >>>>>>>>>>>>> 1 file changed, 8 insertions(+), 8 deletions(-) >>>>>>>>>>>>> >>>>>>>>>>>>> diff --git a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>>> b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>>> index b23d33622f37..1ab629bbee69 100644 >>>>>>>>>>>>> --- a/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>>> +++ b/drivers/gpu/drm/msm/msm_iommu.c >>>>>>>>>>>>> @@ -40,9 +40,9 @@ static int msm_iommu_attach(struct msm_mmu >>>>>>>>>>>>> *mmu, const char * const *names, >>>>>>>>>>>>> struct msm_iommu *iommu = to_msm_iommu(mmu); >>>>>>>>>>>>> int ret; >>>>>>>>>>>>> >>>>>>>>>>>>> - pm_runtime_get_sync(mmu->dev); >>>>>>>>>>>>> + pm_runtime_get_suppliers(mmu->dev); >>>>>>>>>>>>> ret = iommu_attach_device(iommu->domain, mmu->dev); >>>>>>>>>>>>> - pm_runtime_put_sync(mmu->dev); >>>>>>>>>>>>> + pm_runtime_put_suppliers(mmu->dev); >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> For me, it looks like a wrong place to handle runtime PM of IOMMU >>>>>>>>>>>> here. iommu_attach_device() calls into IOMMU driver's >>>>>>>>>>>> attach_device() >>>>>>>>>>>> callback and that's where necessary runtime PM gets should >>>>>>>>>>>> happen, if >>>>>>>>>>>> any. In other words, driver A (MSM DRM driver) shouldn't be >>>>>>>>>>>> dealing >>>>>>>>>>>> with power state of device controlled by driver B (ARM SMMU). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Note that we end up having to do the same, because of >>>>>>>>>>> iommu_unmap() >>>>>>>>>>> while DRM driver is powered off.. it might be cleaner if it was >>>>>>>>>>> all >>>>>>>>>>> self contained in the iommu driver, but that would make it so >>>>>>>>>>> other >>>>>>>>>>> drivers couldn't call iommu_unmap() from an irq handler, which is >>>>>>>>>>> apparently something that some of them want to do.. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I'd assume that runtime PM status is already guaranteed to be >>>>>>>>>> active >>>>>>>>>> when the IRQ handler is running, by some other means (e.g. >>>>>>>>>> pm_runtime_get_sync() called earlier, when queuing some work to the >>>>>>>>>> hardware). Otherwise, I'm not sure how a powered down device could >>>>>>>>>> trigger an IRQ. >>>>>>>>>> >>>>>>>>>> So, if the master device power is already on, suppliers should be >>>>>>>>>> powered on as well, thanks to device links. >>>>>>>>>> >>>>>>>>> >>>>>>>>> umm, that is kindof the inverse of the problem.. the problem is >>>>>>>>> things like gpu driver (and v4l2 drivers that import dma-buf's, >>>>>>>>> afaict).. they will potentially call iommu->unmap() when device is >>>>>>>>> not >>>>>>>>> active (due to userspace or things beyond the control of the >>>>>>>>> driver).. >>>>>>>>> so *they* would want iommu to do pm get/put calls. >>>>>>>> >>>>>>>> >>>>>>>> Which is fine and which is actually already done by one of the >>>>>>>> patches >>>>>>>> in this series, not for map/unmap, but probe, add_device, >>>>>>>> remove_device. Having parts of the API doing it inside the callback >>>>>>>> and other parts outside sounds at least inconsistent. >>>>>>>> >>>>>>>>> But other drivers >>>>>>>>> trying to unmap from irq ctx would not. Which is the contradictory >>>>>>>>> requirement that lead to the idea of iommu user powering up iommu >>>>>>>>> for >>>>>>>>> unmap. >>>>>>>> >>>>>>>> >>>>>>>> Sorry, maybe I wasn't clear. My last message was supposed to show >>>>>>>> that >>>>>>>> it's not contradictory at all, because "other drivers trying to unmap >>>>>>>> from irq ctx" would already have called pm_runtime_get_*() earlier >>>>>>>> from a non-irq ctx, which would have also done the same on all the >>>>>>>> linked suppliers, including the IOMMU. The ultimate result would be >>>>>>>> that the map/unmap() of the IOMMU driver calling >>>>>>>> pm_runtime_get_sync() >>>>>>>> would do nothing besides incrementing the reference count. >>>>>>> >>>>>>> >>>>>>> The entire point was to avoid the slowpath that >>>>>>> pm_runtime_get/put_sync() >>>>>>> would add in map/unmap. It would not be correct to add a slowpath in >>>>>>> irq_ctx >>>>>>> for taking care of non-irq_ctx and for the situations where master is >>>>>>> already >>>>>>> powered-off. >>>>>> >>>>>> >>>>>> Correct me if I'm wrong, but I believe that with what I'm proposing >>>>>> there wouldn't be any slow path. >>>>> >>>>> >>>>> Yea, but only when the power domain is irq-safe? And not all platforms >>>>> enable irq-safe power domains. For instance, msm doesn't enable its >>>>> gdsc power domains as irq-safe. >>>>> Is it something i am missing? >>>> >>>> >>>> irq-safe would matter if there would exist a case when the call is >>>> done from IRQ context and the power is off. As I explained in a), it >>>> shouldn't happen. >>> >>> >>> Hi Robin, Will >>> >>> Does adding pm_runtime_get() in map/unmap sounds good to you? >> >> >> Given that we spent significant effort last year removing as much locking as >> we possibly could from the map/unmap path to minimise the significant >> performance impact it was having on networking/storage/etc. workloads, I >> really don't want to introduce more for the sake of one specific use-case, >> so no. > > Could you elaborate on what kind of locking you are concerned about? > As I explained before, the normally happening fast path would lock > dev->power_lock only for the brief moment of incrementing the runtime > PM usage counter. My bad, that's not even it. The atomic usage counter is incremented beforehands, without any locking [1] and the spinlock is acquired only for the sake of validating that device's runtime PM state remained valid indeed [2], which would be the case in the fast path of the same driver doing two mappings in parallel, with the master powered on (and so the SMMU, through device links; if master was not powered on already, powering on the SMMU is unavoidable anyway and it would add much more latency than the spinlock itself). [1] http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028 [2] http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613 In any case, I can't imagine this working with V4L2 or anything else relying on any memory management more generic than calling IOMMU API directly from the driver, with the IOMMU device having runtime PM enabled, but without managing the runtime PM from the IOMMU driver's callbacks that need access to the hardware. As I mentioned before, only the IOMMU driver knows when exactly the real hardware access needs to be done (e.g. Rockchip/Exynos don't need to do that for map/unmap if the power is down, but some implementations of SMMU with TLB powered separately might need to do so). Best regards, Tomasz ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5A9B-di9svtiJbvk2hz1U1xo61rTY5vt6AD+KR5iMcG-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5A9B-di9svtiJbvk2hz1U1xo61rTY5vt6AD+KR5iMcG-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-15 17:14 ` Robin Murphy [not found] ` <7406f1ce-c2c9-a6bd-2886-5a34de45add6-5wv7dgnIgG8@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Robin Murphy @ 2018-02-15 17:14 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Rafael J. Wysocki, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On 15/02/18 04:17, Tomasz Figa wrote: [...] >> Could you elaborate on what kind of locking you are concerned about? >> As I explained before, the normally happening fast path would lock >> dev->power_lock only for the brief moment of incrementing the runtime >> PM usage counter. > > My bad, that's not even it. > > The atomic usage counter is incremented beforehands, without any > locking [1] and the spinlock is acquired only for the sake of > validating that device's runtime PM state remained valid indeed [2], > which would be the case in the fast path of the same driver doing two > mappings in parallel, with the master powered on (and so the SMMU, > through device links; if master was not powered on already, powering > on the SMMU is unavoidable anyway and it would add much more latency > than the spinlock itself). We now have no locking at all in the map path, and only a per-domain lock around TLB sync in unmap which is unfortunately necessary for correctness; the latter isn't too terrible, since in "serious" hardware it should only be serialising a few cpus serving the same device against each other (e.g. for multiple queues on a single NIC). Putting in a global lock which serialises *all* concurrent map and unmap calls for *all* unrelated devices makes things worse. Period. Even if the lock itself were held for the minimum possible time, i.e. trivially "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that one cache line around between 96 CPUs across two sockets is not negligible. > [1] http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028 > [2] http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613 > > In any case, I can't imagine this working with V4L2 or anything else > relying on any memory management more generic than calling IOMMU API > directly from the driver, with the IOMMU device having runtime PM > enabled, but without managing the runtime PM from the IOMMU driver's > callbacks that need access to the hardware. As I mentioned before, > only the IOMMU driver knows when exactly the real hardware access > needs to be done (e.g. Rockchip/Exynos don't need to do that for > map/unmap if the power is down, but some implementations of SMMU with > TLB powered separately might need to do so). It's worth noting that Exynos and Rockchip are relatively small self-contained IP blocks integrated closely with the interfaces of their relevant master devices; SMMU is an architecture, implementations of which may be large, distributed, and have complex and wildly differing internal topologies. As such, it's a lot harder to make hardware-specific assumptions and/or be correct for all possible cases. Don't get me wrong, I do ultimately agree that the IOMMU driver is the only agent who ultimately knows what calls are going to be necessary for whatever operation it's performing on its own hardware*; it's just that for SMMU it needs to be implemented in a way that has zero impact on the cases where it doesn't matter, because it's not viable to specialise that driver for any particular IP implementation/use-case. Robin. *AFAICS it still makes some sense to have the get_suppliers option as well, though - the IOMMU driver does what it needs for correctness internally, but the external consumer doing something non-standard can can grab and hold the link around multiple calls to short-circuit that. ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <7406f1ce-c2c9-a6bd-2886-5a34de45add6-5wv7dgnIgG8@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <7406f1ce-c2c9-a6bd-2886-5a34de45add6-5wv7dgnIgG8@public.gmane.org> @ 2018-02-16 0:13 ` Tomasz Figa [not found] ` <CAAFQd5DkLtq2w00=Zd4sMDB4QOWqi7R-zgydECJXLdTmaHty+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-16 0:13 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Linux PM, David Airlie, Rafael J. Wysocki, Joerg Roedel, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Rob Clark, Rob Herring, Vivek Gautam, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> wrote: > On 15/02/18 04:17, Tomasz Figa wrote: > [...] >>> >>> Could you elaborate on what kind of locking you are concerned about? >>> As I explained before, the normally happening fast path would lock >>> dev->power_lock only for the brief moment of incrementing the runtime >>> PM usage counter. >> >> >> My bad, that's not even it. >> >> The atomic usage counter is incremented beforehands, without any >> locking [1] and the spinlock is acquired only for the sake of >> validating that device's runtime PM state remained valid indeed [2], >> which would be the case in the fast path of the same driver doing two >> mappings in parallel, with the master powered on (and so the SMMU, >> through device links; if master was not powered on already, powering >> on the SMMU is unavoidable anyway and it would add much more latency >> than the spinlock itself). > > > We now have no locking at all in the map path, and only a per-domain lock > around TLB sync in unmap which is unfortunately necessary for correctness; > the latter isn't too terrible, since in "serious" hardware it should only be > serialising a few cpus serving the same device against each other (e.g. for > multiple queues on a single NIC). > > Putting in a global lock which serialises *all* concurrent map and unmap > calls for *all* unrelated devices makes things worse. Period. Even if the > lock itself were held for the minimum possible time, i.e. trivially > "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that > one cache line around between 96 CPUs across two sockets is not negligible. Fair enough. Note that we're in a quite interesting situation now: a) We need to have runtime PM enabled on Qualcomm SoC to have power properly managed, b) We need to have lock-free map/unmap on such distributed systems, c) If runtime PM is enabled, we need to call into runtime PM from any code that does hardware accesses, otherwise the IOMMU API (and so DMA API and then any V4L2 driver) becomes unusable. I can see one more way that could potentially let us have all the three. How about enabling runtime PM only on selected implementations (e.g. qcom,smmu) and then having all the runtime PM calls surrounded with if (pm_runtime_enabled()), which is lockless? > >> [1] >> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028 >> [2] >> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613 >> >> In any case, I can't imagine this working with V4L2 or anything else >> relying on any memory management more generic than calling IOMMU API >> directly from the driver, with the IOMMU device having runtime PM >> enabled, but without managing the runtime PM from the IOMMU driver's >> callbacks that need access to the hardware. As I mentioned before, >> only the IOMMU driver knows when exactly the real hardware access >> needs to be done (e.g. Rockchip/Exynos don't need to do that for >> map/unmap if the power is down, but some implementations of SMMU with >> TLB powered separately might need to do so). > > > It's worth noting that Exynos and Rockchip are relatively small > self-contained IP blocks integrated closely with the interfaces of their > relevant master devices; SMMU is an architecture, implementations of which > may be large, distributed, and have complex and wildly differing internal > topologies. As such, it's a lot harder to make hardware-specific assumptions > and/or be correct for all possible cases. > > Don't get me wrong, I do ultimately agree that the IOMMU driver is the only > agent who ultimately knows what calls are going to be necessary for whatever > operation it's performing on its own hardware*; it's just that for SMMU it > needs to be implemented in a way that has zero impact on the cases where it > doesn't matter, because it's not viable to specialise that driver for any > particular IP implementation/use-case. Still, exactly the same holds for the low power embedded use cases, where we strive for the lowest possible power consumption, while maintaining performance levels high as well. And so the SMMU code is expected to also work with our use cases, such as V4L2 or DRM drivers. Since these points don't hold for current SMMU code, I could say that the it has been already specialized for large, distributed implementations. Best regards, Tomasz _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5DkLtq2w00=Zd4sMDB4QOWqi7R-zgydECJXLdTmaHty+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5DkLtq2w00=Zd4sMDB4QOWqi7R-zgydECJXLdTmaHty+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-22 8:13 ` Tomasz Figa [not found] ` <CAAFQd5CQoDqunAunwoVo7W=QXa=ET=eJ2s_j9j+3YgAR2EGgCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-22 13:45 ` Robin Murphy 1 sibling, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-22 8:13 UTC (permalink / raw) To: Robin Murphy Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Linux PM, David Airlie, Rafael J. Wysocki, Joerg Roedel, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Rob Clark, Rob Herring, Vivek Gautam, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Fri, Feb 16, 2018 at 9:13 AM, Tomasz Figa <tfiga@chromium.org> wrote: > On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> wrote: >> On 15/02/18 04:17, Tomasz Figa wrote: >> [...] >>>> >>>> Could you elaborate on what kind of locking you are concerned about? >>>> As I explained before, the normally happening fast path would lock >>>> dev->power_lock only for the brief moment of incrementing the runtime >>>> PM usage counter. >>> >>> >>> My bad, that's not even it. >>> >>> The atomic usage counter is incremented beforehands, without any >>> locking [1] and the spinlock is acquired only for the sake of >>> validating that device's runtime PM state remained valid indeed [2], >>> which would be the case in the fast path of the same driver doing two >>> mappings in parallel, with the master powered on (and so the SMMU, >>> through device links; if master was not powered on already, powering >>> on the SMMU is unavoidable anyway and it would add much more latency >>> than the spinlock itself). >> >> >> We now have no locking at all in the map path, and only a per-domain lock >> around TLB sync in unmap which is unfortunately necessary for correctness; >> the latter isn't too terrible, since in "serious" hardware it should only be >> serialising a few cpus serving the same device against each other (e.g. for >> multiple queues on a single NIC). >> >> Putting in a global lock which serialises *all* concurrent map and unmap >> calls for *all* unrelated devices makes things worse. Period. Even if the >> lock itself were held for the minimum possible time, i.e. trivially >> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that >> one cache line around between 96 CPUs across two sockets is not negligible. > > Fair enough. Note that we're in a quite interesting situation now: > a) We need to have runtime PM enabled on Qualcomm SoC to have power > properly managed, > b) We need to have lock-free map/unmap on such distributed systems, > c) If runtime PM is enabled, we need to call into runtime PM from any > code that does hardware accesses, otherwise the IOMMU API (and so DMA > API and then any V4L2 driver) becomes unusable. > > I can see one more way that could potentially let us have all the > three. How about enabling runtime PM only on selected implementations > (e.g. qcom,smmu) and then having all the runtime PM calls surrounded > with if (pm_runtime_enabled()), which is lockless? > Sorry for pinging, but any opinion on this kind of approach? Best regards, Tomasz >> >>> [1] >>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028 >>> [2] >>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613 >>> >>> In any case, I can't imagine this working with V4L2 or anything else >>> relying on any memory management more generic than calling IOMMU API >>> directly from the driver, with the IOMMU device having runtime PM >>> enabled, but without managing the runtime PM from the IOMMU driver's >>> callbacks that need access to the hardware. As I mentioned before, >>> only the IOMMU driver knows when exactly the real hardware access >>> needs to be done (e.g. Rockchip/Exynos don't need to do that for >>> map/unmap if the power is down, but some implementations of SMMU with >>> TLB powered separately might need to do so). >> >> >> It's worth noting that Exynos and Rockchip are relatively small >> self-contained IP blocks integrated closely with the interfaces of their >> relevant master devices; SMMU is an architecture, implementations of which >> may be large, distributed, and have complex and wildly differing internal >> topologies. As such, it's a lot harder to make hardware-specific assumptions >> and/or be correct for all possible cases. >> >> Don't get me wrong, I do ultimately agree that the IOMMU driver is the only >> agent who ultimately knows what calls are going to be necessary for whatever >> operation it's performing on its own hardware*; it's just that for SMMU it >> needs to be implemented in a way that has zero impact on the cases where it >> doesn't matter, because it's not viable to specialise that driver for any >> particular IP implementation/use-case. > > Still, exactly the same holds for the low power embedded use cases, > where we strive for the lowest possible power consumption, while > maintaining performance levels high as well. And so the SMMU code is > expected to also work with our use cases, such as V4L2 or DRM drivers. > Since these points don't hold for current SMMU code, I could say that > the it has been already specialized for large, distributed > implementations. > > Best regards, > Tomasz _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5CQoDqunAunwoVo7W=QXa=ET=eJ2s_j9j+3YgAR2EGgCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5CQoDqunAunwoVo7W=QXa=ET=eJ2s_j9j+3YgAR2EGgCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-22 13:30 ` Rob Clark 0 siblings, 0 replies; 53+ messages in thread From: Rob Clark @ 2018-02-22 13:30 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, Linux PM, David Airlie, Will Deacon, Joerg Roedel, Rafael J. Wysocki, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Jordan Crouse, Rob Herring, Vivek Gautam, Greg KH, freedreno, Robin Murphy, Stephen Boyd, linux-arm-msm On Thu, Feb 22, 2018 at 3:13 AM, Tomasz Figa <tfiga@chromium.org> wrote: > On Fri, Feb 16, 2018 at 9:13 AM, Tomasz Figa <tfiga@chromium.org> wrote: >> On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> wrote: >>> On 15/02/18 04:17, Tomasz Figa wrote: >>> [...] >>>>> >>>>> Could you elaborate on what kind of locking you are concerned about? >>>>> As I explained before, the normally happening fast path would lock >>>>> dev->power_lock only for the brief moment of incrementing the runtime >>>>> PM usage counter. >>>> >>>> >>>> My bad, that's not even it. >>>> >>>> The atomic usage counter is incremented beforehands, without any >>>> locking [1] and the spinlock is acquired only for the sake of >>>> validating that device's runtime PM state remained valid indeed [2], >>>> which would be the case in the fast path of the same driver doing two >>>> mappings in parallel, with the master powered on (and so the SMMU, >>>> through device links; if master was not powered on already, powering >>>> on the SMMU is unavoidable anyway and it would add much more latency >>>> than the spinlock itself). >>> >>> >>> We now have no locking at all in the map path, and only a per-domain lock >>> around TLB sync in unmap which is unfortunately necessary for correctness; >>> the latter isn't too terrible, since in "serious" hardware it should only be >>> serialising a few cpus serving the same device against each other (e.g. for >>> multiple queues on a single NIC). >>> >>> Putting in a global lock which serialises *all* concurrent map and unmap >>> calls for *all* unrelated devices makes things worse. Period. Even if the >>> lock itself were held for the minimum possible time, i.e. trivially >>> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that >>> one cache line around between 96 CPUs across two sockets is not negligible. >> >> Fair enough. Note that we're in a quite interesting situation now: >> a) We need to have runtime PM enabled on Qualcomm SoC to have power >> properly managed, >> b) We need to have lock-free map/unmap on such distributed systems, >> c) If runtime PM is enabled, we need to call into runtime PM from any >> code that does hardware accesses, otherwise the IOMMU API (and so DMA >> API and then any V4L2 driver) becomes unusable. >> >> I can see one more way that could potentially let us have all the >> three. How about enabling runtime PM only on selected implementations >> (e.g. qcom,smmu) and then having all the runtime PM calls surrounded >> with if (pm_runtime_enabled()), which is lockless? >> > > Sorry for pinging, but any opinion on this kind of approach? > It is ok by me, for whatever that is worth BR, -R _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5DkLtq2w00=Zd4sMDB4QOWqi7R-zgydECJXLdTmaHty+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-22 8:13 ` Tomasz Figa @ 2018-02-22 13:45 ` Robin Murphy 2018-02-22 14:12 ` Tomasz Figa 1 sibling, 1 reply; 53+ messages in thread From: Robin Murphy @ 2018-02-22 13:45 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Linux PM, David Airlie, Rafael J. Wysocki, Joerg Roedel, Will Deacon, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, dri-devel, Linux Kernel Mailing List, Rob Clark, Rob Herring, Vivek Gautam, Greg KH, freedreno, Stephen Boyd, linux-arm-msm [sorry, I had intended to reply sooner but clearly forgot] On 16/02/18 00:13, Tomasz Figa wrote: > On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> wrote: >> On 15/02/18 04:17, Tomasz Figa wrote: >> [...] >>>> >>>> Could you elaborate on what kind of locking you are concerned about? >>>> As I explained before, the normally happening fast path would lock >>>> dev->power_lock only for the brief moment of incrementing the runtime >>>> PM usage counter. >>> >>> >>> My bad, that's not even it. >>> >>> The atomic usage counter is incremented beforehands, without any >>> locking [1] and the spinlock is acquired only for the sake of >>> validating that device's runtime PM state remained valid indeed [2], >>> which would be the case in the fast path of the same driver doing two >>> mappings in parallel, with the master powered on (and so the SMMU, >>> through device links; if master was not powered on already, powering >>> on the SMMU is unavoidable anyway and it would add much more latency >>> than the spinlock itself). >> >> >> We now have no locking at all in the map path, and only a per-domain lock >> around TLB sync in unmap which is unfortunately necessary for correctness; >> the latter isn't too terrible, since in "serious" hardware it should only be >> serialising a few cpus serving the same device against each other (e.g. for >> multiple queues on a single NIC). >> >> Putting in a global lock which serialises *all* concurrent map and unmap >> calls for *all* unrelated devices makes things worse. Period. Even if the >> lock itself were held for the minimum possible time, i.e. trivially >> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing that >> one cache line around between 96 CPUs across two sockets is not negligible. > > Fair enough. Note that we're in a quite interesting situation now: > a) We need to have runtime PM enabled on Qualcomm SoC to have power > properly managed, > b) We need to have lock-free map/unmap on such distributed systems, > c) If runtime PM is enabled, we need to call into runtime PM from any > code that does hardware accesses, otherwise the IOMMU API (and so DMA > API and then any V4L2 driver) becomes unusable. > > I can see one more way that could potentially let us have all the > three. How about enabling runtime PM only on selected implementations > (e.g. qcom,smmu) and then having all the runtime PM calls surrounded > with if (pm_runtime_enabled()), which is lockless? Yes, that's the kind of thing I was gravitating towards - my vague thought was adding some flag to the smmu_domain, but pm_runtime_enabled() does look conceptually a lot cleaner. >> >>> [1] >>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L1028 >>> [2] >>> http://elixir.free-electrons.com/linux/v4.16-rc1/source/drivers/base/power/runtime.c#L613 >>> >>> In any case, I can't imagine this working with V4L2 or anything else >>> relying on any memory management more generic than calling IOMMU API >>> directly from the driver, with the IOMMU device having runtime PM >>> enabled, but without managing the runtime PM from the IOMMU driver's >>> callbacks that need access to the hardware. As I mentioned before, >>> only the IOMMU driver knows when exactly the real hardware access >>> needs to be done (e.g. Rockchip/Exynos don't need to do that for >>> map/unmap if the power is down, but some implementations of SMMU with >>> TLB powered separately might need to do so). >> >> >> It's worth noting that Exynos and Rockchip are relatively small >> self-contained IP blocks integrated closely with the interfaces of their >> relevant master devices; SMMU is an architecture, implementations of which >> may be large, distributed, and have complex and wildly differing internal >> topologies. As such, it's a lot harder to make hardware-specific assumptions >> and/or be correct for all possible cases. >> >> Don't get me wrong, I do ultimately agree that the IOMMU driver is the only >> agent who ultimately knows what calls are going to be necessary for whatever >> operation it's performing on its own hardware*; it's just that for SMMU it >> needs to be implemented in a way that has zero impact on the cases where it >> doesn't matter, because it's not viable to specialise that driver for any >> particular IP implementation/use-case. > > Still, exactly the same holds for the low power embedded use cases, > where we strive for the lowest possible power consumption, while > maintaining performance levels high as well. And so the SMMU code is > expected to also work with our use cases, such as V4L2 or DRM drivers. > Since these points don't hold for current SMMU code, I could say that > the it has been already specialized for large, distributed > implementations. Heh, really it's specialised for ease of maintenance in terms of doing as little as we can get away with, but for what we have implemented, fast code does save CPU cycles and power on embedded systems too ;) Robin. _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers 2018-02-22 13:45 ` Robin Murphy @ 2018-02-22 14:12 ` Tomasz Figa [not found] ` <CAAFQd5CzOJ0oS8LLcKP-DgyXSXgzafg5CWPAvnR-QDPS+DZAUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 53+ messages in thread From: Tomasz Figa @ 2018-02-22 14:12 UTC (permalink / raw) To: Robin Murphy, Vivek Gautam Cc: Mark Rutland, devicetree, Linux PM, David Airlie, Will Deacon, Rafael J. Wysocki, dri-devel, Linux Kernel Mailing List, list@263.net:IOMMU DRIVERS, Rob Herring, Greg KH, freedreno, Stephen Boyd, linux-arm-msm On Thu, Feb 22, 2018 at 10:45 PM, Robin Murphy <robin.murphy@arm.com> wrote: > [sorry, I had intended to reply sooner but clearly forgot] > > > On 16/02/18 00:13, Tomasz Figa wrote: >> >> On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> >> wrote: >>> >>> On 15/02/18 04:17, Tomasz Figa wrote: >>> [...] >>>>> >>>>> >>>>> Could you elaborate on what kind of locking you are concerned about? >>>>> As I explained before, the normally happening fast path would lock >>>>> dev->power_lock only for the brief moment of incrementing the runtime >>>>> PM usage counter. >>>> >>>> >>>> >>>> My bad, that's not even it. >>>> >>>> The atomic usage counter is incremented beforehands, without any >>>> locking [1] and the spinlock is acquired only for the sake of >>>> validating that device's runtime PM state remained valid indeed [2], >>>> which would be the case in the fast path of the same driver doing two >>>> mappings in parallel, with the master powered on (and so the SMMU, >>>> through device links; if master was not powered on already, powering >>>> on the SMMU is unavoidable anyway and it would add much more latency >>>> than the spinlock itself). >>> >>> >>> >>> We now have no locking at all in the map path, and only a per-domain lock >>> around TLB sync in unmap which is unfortunately necessary for >>> correctness; >>> the latter isn't too terrible, since in "serious" hardware it should only >>> be >>> serialising a few cpus serving the same device against each other (e.g. >>> for >>> multiple queues on a single NIC). >>> >>> Putting in a global lock which serialises *all* concurrent map and unmap >>> calls for *all* unrelated devices makes things worse. Period. Even if the >>> lock itself were held for the minimum possible time, i.e. trivially >>> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing >>> that >>> one cache line around between 96 CPUs across two sockets is not >>> negligible. >> >> >> Fair enough. Note that we're in a quite interesting situation now: >> a) We need to have runtime PM enabled on Qualcomm SoC to have power >> properly managed, >> b) We need to have lock-free map/unmap on such distributed systems, >> c) If runtime PM is enabled, we need to call into runtime PM from any >> code that does hardware accesses, otherwise the IOMMU API (and so DMA >> API and then any V4L2 driver) becomes unusable. >> >> I can see one more way that could potentially let us have all the >> three. How about enabling runtime PM only on selected implementations >> (e.g. qcom,smmu) and then having all the runtime PM calls surrounded >> with if (pm_runtime_enabled()), which is lockless? > > > Yes, that's the kind of thing I was gravitating towards - my vague thought > was adding some flag to the smmu_domain, but pm_runtime_enabled() does look > conceptually a lot cleaner. Great, thanks. Looks like we're in agreement now. \o/ Vivek, does this sound reasonable to you? Best regards, Tomasz _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 53+ messages in thread
[parent not found: <CAAFQd5CzOJ0oS8LLcKP-DgyXSXgzafg5CWPAvnR-QDPS+DZAUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers [not found] ` <CAAFQd5CzOJ0oS8LLcKP-DgyXSXgzafg5CWPAvnR-QDPS+DZAUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2018-02-22 17:24 ` Vivek Gautam 0 siblings, 0 replies; 53+ messages in thread From: Vivek Gautam @ 2018-02-22 17:24 UTC (permalink / raw) To: Tomasz Figa Cc: Mark Rutland, devicetree-u79uwXL29TY76Z2rM5mHXA, jcrouse-sgV2jX0FEOL9JmXXK+q4OQ, Linux PM, David Airlie, Rafael J. Wysocki, Joerg Roedel, Will Deacon, Rob Clark, dri-devel, Linux Kernel Mailing List, list-Y9sIeH5OGRo@public.gmane.org:IOMMU DRIVERS, Rob Herring, Greg KH, freedreno, Robin Murphy, Stephen Boyd, linux-arm-msm Hi, On Thu, Feb 22, 2018 at 7:42 PM, Tomasz Figa <tfiga@chromium.org> wrote: > On Thu, Feb 22, 2018 at 10:45 PM, Robin Murphy <robin.murphy@arm.com> wrote: >> [sorry, I had intended to reply sooner but clearly forgot] >> >> >> On 16/02/18 00:13, Tomasz Figa wrote: >>> >>> On Fri, Feb 16, 2018 at 2:14 AM, Robin Murphy <robin.murphy@arm.com> >>> wrote: >>>> >>>> On 15/02/18 04:17, Tomasz Figa wrote: >>>> [...] >>>>>> >>>>>> >>>>>> Could you elaborate on what kind of locking you are concerned about? >>>>>> As I explained before, the normally happening fast path would lock >>>>>> dev->power_lock only for the brief moment of incrementing the runtime >>>>>> PM usage counter. >>>>> >>>>> >>>>> >>>>> My bad, that's not even it. >>>>> >>>>> The atomic usage counter is incremented beforehands, without any >>>>> locking [1] and the spinlock is acquired only for the sake of >>>>> validating that device's runtime PM state remained valid indeed [2], >>>>> which would be the case in the fast path of the same driver doing two >>>>> mappings in parallel, with the master powered on (and so the SMMU, >>>>> through device links; if master was not powered on already, powering >>>>> on the SMMU is unavoidable anyway and it would add much more latency >>>>> than the spinlock itself). >>>> >>>> >>>> >>>> We now have no locking at all in the map path, and only a per-domain lock >>>> around TLB sync in unmap which is unfortunately necessary for >>>> correctness; >>>> the latter isn't too terrible, since in "serious" hardware it should only >>>> be >>>> serialising a few cpus serving the same device against each other (e.g. >>>> for >>>> multiple queues on a single NIC). >>>> >>>> Putting in a global lock which serialises *all* concurrent map and unmap >>>> calls for *all* unrelated devices makes things worse. Period. Even if the >>>> lock itself were held for the minimum possible time, i.e. trivially >>>> "spin_lock(&lock); spin_unlock(&lock)", the cost of repeatedly bouncing >>>> that >>>> one cache line around between 96 CPUs across two sockets is not >>>> negligible. >>> >>> >>> Fair enough. Note that we're in a quite interesting situation now: >>> a) We need to have runtime PM enabled on Qualcomm SoC to have power >>> properly managed, >>> b) We need to have lock-free map/unmap on such distributed systems, >>> c) If runtime PM is enabled, we need to call into runtime PM from any >>> code that does hardware accesses, otherwise the IOMMU API (and so DMA >>> API and then any V4L2 driver) becomes unusable. >>> >>> I can see one more way that could potentially let us have all the >>> three. How about enabling runtime PM only on selected implementations >>> (e.g. qcom,smmu) and then having all the runtime PM calls surrounded >>> with if (pm_runtime_enabled()), which is lockless? >> >> >> Yes, that's the kind of thing I was gravitating towards - my vague thought >> was adding some flag to the smmu_domain, but pm_runtime_enabled() does look >> conceptually a lot cleaner. > > Great, thanks. Looks like we're in agreement now. \o/ > > Vivek, does this sound reasonable to you? Yea, sound good to me. I will respin the patches. Thanks & Regards Vivek > > Best regards, > Tomasz -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation _______________________________________________ Freedreno mailing list Freedreno@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/freedreno ^ permalink raw reply [flat|nested] 53+ messages in thread
end of thread, other threads:[~2018-02-23 17:43 UTC | newest] Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-02-07 10:31 [PATCH v7 0/6] iommu/arm-smmu: Add runtime pm/sleep support Vivek Gautam [not found] ` <1517999482-17317-1-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 2018-02-07 10:31 ` [PATCH v7 1/6] base: power: runtime: Export pm_runtime_get/put_suppliers Vivek Gautam 2018-02-13 7:44 ` Tomasz Figa [not found] ` <CAAFQd5BmroRf-C8dQkvTKHWK1psGnNi1t7g-q=Xce6KjrGTsdw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 12:00 ` Robin Murphy 2018-02-13 12:54 ` Tomasz Figa 2018-02-13 13:37 ` Robin Murphy 2018-02-07 10:31 ` [PATCH v7 2/6] iommu/arm-smmu: Add pm_runtime/sleep ops Vivek Gautam 2018-02-13 8:03 ` Tomasz Figa [not found] ` <CAAFQd5B2u8RL-tdB3qgPxVUcXnsBSEhXRBZWxqO-w6rYKAiOtg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 10:25 ` Vivek Gautam 2018-02-14 3:45 ` Tomasz Figa 2018-02-07 10:31 ` [PATCH v7 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device Vivek Gautam 2018-02-13 8:24 ` Tomasz Figa [not found] ` <CAAFQd5DxYhkK61VDAesby6bT+FtG2nqsbHQRvxkhrsSS0KWtog-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 12:57 ` Robin Murphy [not found] ` <906051dd-8898-ec6f-5ad4-3f37716292cf-5wv7dgnIgG8@public.gmane.org> 2018-02-13 13:52 ` Tomasz Figa [not found] ` <CAAFQd5DJtQYPg5S3Ep2bK27+D5rQiKuA-uPfMDUon3FudmGF0w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 8:24 ` Vivek Gautam 2018-02-14 8:28 ` Vivek Gautam [not found] ` <1517999482-17317-4-git-send-email-vivek.gautam-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org> 2018-02-22 23:52 ` Jordan Crouse [not found] ` <20180222235200.GA18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 2018-02-23 10:36 ` Vivek Gautam [not found] ` <CAFp+6iGQ5Vckui14Jb=V0uk_Pjes95hOxo=KBijR4yxPeDDzFg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-23 15:40 ` [Freedreno] " Jordan Crouse [not found] ` <20180223154048.GB18743-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 2018-02-23 17:43 ` Vivek Gautam 2018-02-07 10:31 ` [PATCH v7 4/6] iommu/arm-smmu: Add the device_link between masters and smmu Vivek Gautam 2018-02-13 8:31 ` Tomasz Figa [not found] ` <CAAFQd5BQAs9=N27_Z0pJNSrndFY3vFin2KBP44UtiW+tMXy5nQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 10:14 ` Vivek Gautam 2018-02-07 10:31 ` [PATCH v7 5/6] iommu/arm-smmu: Add support for qcom, smmu-v2 variant Vivek Gautam 2018-02-09 10:57 ` [PATCH v8 5/6] iommu/arm-smmu: Add support for qcom,smmu-v2 variant Vivek Gautam 2018-02-13 8:57 ` Tomasz Figa 2018-02-07 10:31 ` [PATCH v7 6/6] drm/msm: iommu: Replace runtime calls with runtime suppliers Vivek Gautam 2018-02-13 9:10 ` Tomasz Figa [not found] ` <CAAFQd5AjopiX6fDgD+mO+-+d0yj-swEnVCNvccWRBSMO+XVJkA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-13 16:42 ` Jordan Crouse 2018-02-14 3:31 ` Tomasz Figa [not found] ` <CAAFQd5CjQRFATfh-mRQv5J=WefYuxBVTkk=Ju09FoqA-Or5Cvg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 15:48 ` Jordan Crouse [not found] ` <20180214154850.GA25422-9PYrDHPZ2Orvke4nUoYGnHL1okKdlPRT@public.gmane.org> 2018-02-14 16:12 ` [Freedreno] " Rob Clark 2018-02-15 4:09 ` Tomasz Figa [not found] ` <CAAFQd5C-9mbd3hDSvz10a1oiO0--FT-L4EpsAYcALxxUvk6Fjg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-15 14:14 ` Rob Clark 2018-02-13 18:03 ` Rob Clark 2018-02-14 1:59 ` Tomasz Figa [not found] ` <CAAFQd5BKRumpEfAKNF_RKS-ZZ8D671DfOz4vB2+w1SV3aG9NxQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 2:13 ` Rob Clark [not found] ` <CAF6AEGuNZJKtwGZ5mLfqNND2jtU+HYM11UONfAtVTzoM0QVpdg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 3:01 ` Tomasz Figa [not found] ` <CAAFQd5BZJ1G0RG32hYErNzPRvisBhhiSNCBsjbzfm0WzO=DnsQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 4:17 ` Vivek Gautam [not found] ` <CAFp+6iHaycK=CcE1S15EeuMkaw8LnW0ebptU0hM6tUtWdeEOtA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 5:38 ` Tomasz Figa [not found] ` <CAAFQd5Afj-Bj+3wHwmF2tT7y=46EsYEtO_mXfY6stXBgHutEUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 9:13 ` Vivek Gautam [not found] ` <CAFp+6iGX6pr+MdPSSHHG=qOnhHky_8OHiDqAcJ9UudEUv=JMHg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 9:16 ` Tomasz Figa [not found] ` <CAAFQd5DiwAugGnPOTw0+XrEfef9x-n-vx59JFuXpNawjiXHwCw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 10:33 ` Vivek Gautam [not found] ` <CAFp+6iEW0faeHDfzN_F1bRrHGcVo3sPCk4HSY=t9dnEvHkDkYw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-14 16:03 ` Robin Murphy 2018-02-15 3:17 ` Tomasz Figa [not found] ` <CAAFQd5AmG1zSm+CouXOCJbs8SNGFk1-RqfU1nWGjMGJMB-qfvw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-15 4:17 ` Tomasz Figa [not found] ` <CAAFQd5A9B-di9svtiJbvk2hz1U1xo61rTY5vt6AD+KR5iMcG-A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-15 17:14 ` Robin Murphy [not found] ` <7406f1ce-c2c9-a6bd-2886-5a34de45add6-5wv7dgnIgG8@public.gmane.org> 2018-02-16 0:13 ` Tomasz Figa [not found] ` <CAAFQd5DkLtq2w00=Zd4sMDB4QOWqi7R-zgydECJXLdTmaHty+g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-22 8:13 ` Tomasz Figa [not found] ` <CAAFQd5CQoDqunAunwoVo7W=QXa=ET=eJ2s_j9j+3YgAR2EGgCg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-22 13:30 ` Rob Clark 2018-02-22 13:45 ` Robin Murphy 2018-02-22 14:12 ` Tomasz Figa [not found] ` <CAAFQd5CzOJ0oS8LLcKP-DgyXSXgzafg5CWPAvnR-QDPS+DZAUg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2018-02-22 17:24 ` Vivek Gautam
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).