From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D8EEC46475 for ; Tue, 23 Oct 2018 07:45:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4EF6C20671 for ; Tue, 23 Oct 2018 07:45:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="key not found in DNS" (0-bit key) header.d=codeaurora.org header.i=@codeaurora.org header.b="EGaSWFJB"; dkim=fail reason="key not found in DNS" (0-bit key) header.d=codeaurora.org header.i=@codeaurora.org header.b="bvxArwO5" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4EF6C20671 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727652AbeJWQHy (ORCPT ); Tue, 23 Oct 2018 12:07:54 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:43798 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727023AbeJWQHy (ORCPT ); Tue, 23 Oct 2018 12:07:54 -0400 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 969FE6021A; Tue, 23 Oct 2018 07:45:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1540280741; bh=kmXqwHARdD6bdyLK8X2ly2tegMqi+eAwPBrLnKglfsk=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=EGaSWFJBLNCv1XwzEf7MKcK3DnbZknC4UUwRCdLgKXoAEDjlA3FfCDwg6ZwMuUphh nPeDFxKvcFiaGXJrTKhfjoRCrHTqhn1tp0rx2sPbfamy0ODXya7ppp8yVhLby8kmrZ PXLBOJ8k3/wH1QbewnKcMSy2lF3NwopotLEUBo6M= Received: from mail-qk1-f173.google.com (mail-qk1-f173.google.com [209.85.222.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: vivek.gautam@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 86E1C6021A for ; Tue, 23 Oct 2018 07:45:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1540280740; bh=kmXqwHARdD6bdyLK8X2ly2tegMqi+eAwPBrLnKglfsk=; h=References:In-Reply-To:From:Date:Subject:To:Cc:From; b=bvxArwO54hu06k6JVtuFWSDiMim6Hw+nFcEGdr8Tkdtfx6U+sDjRguiANtR41dkrH Goxhjc5qnsIFiOldMkT3xyZw6re3gTPKxQPCJ+rCPWsxilu7YSCN5+6NB7F2EiORA1 W/zU71fpDgKrp7F0fugaDTEhi5JnNf2nS/12xCzg= DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 86E1C6021A Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=vivek.gautam@codeaurora.org Received: by mail-qk1-f173.google.com with SMTP id a193-v6so208392qkc.13 for ; Tue, 23 Oct 2018 00:45:40 -0700 (PDT) X-Gm-Message-State: ABuFfoiU8BWVLhSafqjvX9LkuuUkL21/emPHWatBFHipGsMdIfLEolv6 j32K0Fu58i2cKwKz5QVAo1Qqzafxuq6Y3BR34JA= X-Google-Smtp-Source: ACcGV63f0cXDd4CH3KdykQb1w/haE3BTVf2xAlZ/SNo8lg6B87uU8034AUKe9280WRKPwFThu86rcNAhq0qVjhNQtLo= X-Received: by 2002:a37:4952:: with SMTP id w79-v6mr45071017qka.85.1540280739774; Tue, 23 Oct 2018 00:45:39 -0700 (PDT) MIME-Version: 1.0 References: <20180910062551.28175-1-vivek.gautam@codeaurora.org> <20180910062551.28175-5-vivek.gautam@codeaurora.org> <29fd7e9e-708b-b884-4de1-ecc141f41692@arm.com> In-Reply-To: <29fd7e9e-708b-b884-4de1-ecc141f41692@arm.com> From: Vivek Gautam Date: Tue, 23 Oct 2018 13:15:27 +0530 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v2 4/4] iommu/arm-smmu: Add support to handle Qcom's TLBI serialization errata To: Robin Murphy Cc: "list@263.net:IOMMU DRIVERS , Joerg Roedel ," , Andy Gross , Will Deacon , Bjorn Andersson , "list@263.net:IOMMU DRIVERS , Joerg Roedel ," , Linux ARM , David Brown , swboyd@chromium.org, open list , pratikp@codeaurora.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Robin, On Tue, Sep 25, 2018 at 6:01 PM Robin Murphy wrote: > > On 10/09/18 07:25, Vivek Gautam wrote: > > Qcom's implementation of arm,mmu-500 require to serialize all > > TLB invalidations for context banks. > > What does "serailize all TLB invalidations" actually mean, because it's > not entirely clear from context, and furthermore this patch appears to > behave subtly differently to the downstream code so I'm really > struggling to figure out whether it's actually doing what it's intended > to do. Adding Pratik Patel from downstream team. Thanks for taking a look at this. We want to space out the TLB invalidation and then the workaround to toggle wait-safe logic would let the safe checks in HW work and only allow invalidation to occur when device is expected to not run into underruns. > > In case the TLB invalidation requests don't go through the first > > time, there's a way to disable/enable the wait for safe logic. > > Disabling this logic expadites the TLBIs. > > > > Different bootloaders with their access control policies allow this > > register access differntly. With one, we should be able to directly > > make qcom-scm call to do io read/write, while with other we should > > use the specific SCM command to send request to do the complete > > register configuration. > > A separate device tree flag for arm-smmu will allow to identify > > which firmware configuration of the two mentioned above we use. > > > > Signed-off-by: Vivek Gautam > > --- > > drivers/iommu/arm-smmu-regs.h | 2 + > > drivers/iommu/arm-smmu.c | 133 +++++++++++++++++++++++++++++++++= ++++++++- > > 2 files changed, 133 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-reg= s.h > > index a1226e4ab5f8..71662cae9806 100644 > > --- a/drivers/iommu/arm-smmu-regs.h > > +++ b/drivers/iommu/arm-smmu-regs.h > > @@ -177,6 +177,8 @@ enum arm_smmu_s2cr_privcfg { > > #define ARM_SMMU_CB_ATS1PR 0x800 > > #define ARM_SMMU_CB_ATSR 0x8f0 > > > > +#define ARM_SMMU_GID_QCOM_CUSTOM_CFG 0x300 > > + > > #define SCTLR_S1_ASIDPNE (1 << 12) > > #define SCTLR_CFCFG (1 << 7) > > #define SCTLR_CFIE (1 << 6) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > index 411e5ac57c64..de9c4a5bf686 100644 > > --- a/drivers/iommu/arm-smmu.c > > +++ b/drivers/iommu/arm-smmu.c > > @@ -49,6 +49,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > > > @@ -181,7 +182,8 @@ struct arm_smmu_device { > > #define ARM_SMMU_FEAT_EXIDS (1 << 12) > > u32 features; > > > > -#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0) > > +#define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0) > > +#define ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA (1 << 1) > > u32 options; > > enum arm_smmu_arch_version version; > > enum arm_smmu_implementation model; > > @@ -266,6 +268,7 @@ static bool using_legacy_binding, using_generic_bin= ding; > > > > static struct arm_smmu_option_prop arm_smmu_options[] =3D { > > { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-acc= ess" }, > > + { ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA, "qcom,smmu-500-fw-impl-errata= " }, > > { 0, NULL}, > > }; > > > > @@ -531,12 +534,134 @@ static void arm_smmu_tlb_inv_vmid_nosync(unsigne= d long iova, size_t size, > > writel_relaxed(smmu_domain->cfg.vmid, base + ARM_SMMU_GR0_TLBIVMI= D); > > } > > > > +#define CUSTOM_CFG_MDP_SAFE_ENABLE BIT(15) > > +#define CUSTOM_CFG_IFE1_SAFE_ENABLE BIT(14) > > +#define CUSTOM_CFG_IFE0_SAFE_ENABLE BIT(13) > > + > > +static int __qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, i= nt en) > > +{ > > + int ret; > > + u32 val, gid_phys_base; > > + phys_addr_t reg; > > + struct vm_struct *vm; > > + > > + /* We want physical address of SMMU, so the vm_area */ > > + vm =3D find_vm_area(smmu->base); > > + > > + /* > > + * GID (implementation defined address space) is located at > > + * SMMU_BASE + (2 =C3=97 PAGESIZE). > > + */ > > + gid_phys_base =3D vm->phys_addr + (2 << (smmu)->pgshift); > > + reg =3D gid_phys_base + ARM_SMMU_GID_QCOM_CUSTOM_CFG; > > + > > + ret =3D qcom_scm_io_readl_atomic(reg, &val); > > + if (ret) > > + return ret; > > + > > + if (en) > > + val |=3D CUSTOM_CFG_MDP_SAFE_ENABLE | > > + CUSTOM_CFG_IFE0_SAFE_ENABLE | > > + CUSTOM_CFG_IFE1_SAFE_ENABLE; > > + else > > + val &=3D ~(CUSTOM_CFG_MDP_SAFE_ENABLE | > > + CUSTOM_CFG_IFE0_SAFE_ENABLE | > > + CUSTOM_CFG_IFE1_SAFE_ENABLE); > > + > > + ret =3D qcom_scm_io_writel_atomic(reg, val); > > + > > + return ret; > > +} > > + > > +static int qsmmu500_wait_safe_toggle(struct arm_smmu_device *smmu, > > + int en, bool is_fw_impl) > > +{ > > + if (is_fw_impl) > > + return qcom_scm_qsmmu500_wait_safe_toggle(en); > > + else > > + return __qsmmu500_wait_safe_toggle(smmu, en); > > +} > > + > > +static void qcom_errata_tlb_sync(struct arm_smmu_domain *smmu_domain) > > +{ > > + struct arm_smmu_device *smmu =3D smmu_domain->smmu; > > + void __iomem *base =3D ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx); > > + bool is_fw_impl; > > + u32 val; > > + > > + writel_relaxed(0, base + ARM_SMMU_CB_TLBSYNC); > > + > > + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val, > > + !(val & sTLBGSTATUS_GSACTIVE), 0, = 100)) > > + return; > > + > > + is_fw_impl =3D smmu->options & ARM_SMMU_OPT_QCOM_FW_IMPL_ERRATA ? > > + true : false; > > + > > + /* SCM call here to disable the wait-for-safe logic. */ > > I take it this is a global state, so it can't just be turned off for the > relevant contexts and left that way? Yes this is a global state disable/enable. But can we disable it for the required context altogether forever, I will have to find it out. > > > + if (WARN(qsmmu500_wait_safe_toggle(smmu, false, is_fw_impl), > > + "Failed to disable wait-safe logic, bad hw state\n")) > > + return; > > + > > + if (!readl_poll_timeout_atomic(base + ARM_SMMU_CB_TLBSTATUS, val, > > + !(val & sTLBGSTATUS_GSACTIVE), 0, = 10000)) > > + return; > > + > > + /* SCM call here to re-enable the wait-for-safe logic. */ > > + WARN(qsmmu500_wait_safe_toggle(smmu, true, is_fw_impl), > > + "Failed to re-enable wait-safe logic, bad hw state\n"); > > + > > + dev_err_ratelimited(smmu->dev, > > + "TLB sync timed out -- SMMU in bad state\n"); > > +} > > + > > +static void qcom_errata_tlb_sync_context(void *cookie) > > +{ > > + struct arm_smmu_domain *smmu_domain =3D cookie; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&smmu_domain->cb_lock, flags); > > + qcom_errata_tlb_sync(smmu_domain); > > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); > > +} > > + > > +static void qcom_errata_tlb_inv_context_s1(void *cookie) > > +{ > > + struct arm_smmu_domain *smmu_domain =3D cookie; > > + struct arm_smmu_cfg *cfg =3D &smmu_domain->cfg; > > + void __iomem *base =3D ARM_SMMU_CB(smmu_domain->smmu, cfg->cbndx)= ; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&smmu_domain->cb_lock, flags); > > + writel_relaxed(cfg->asid, base + ARM_SMMU_CB_S1_TLBIASID); > > + qcom_errata_tlb_sync(cookie); > > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); > > +} > > + > > +static void qcom_errata_tlb_inv_range_nosync(unsigned long iova, size_= t size, > > + size_t granule, bool leaf, > > + void *cookie) > > +{ > > + struct arm_smmu_domain *smmu_domain =3D cookie; > > + unsigned long flags; > > + > > + spin_lock_irqsave(&smmu_domain->cb_lock, flags); > > + arm_smmu_tlb_inv_range_nosync(iova, size, granule, leaf, cookie); > > + spin_unlock_irqrestore(&smmu_domain->cb_lock, flags); > > I don't get what this locking is trying to achieve - the only thing it > protects is one or more writes to TLBIVA{L}, which *must* surely be > "serialised" by the interconnect anyway? > Okay, right. This callback is just a write to TLBIVA{L} registers, so this should be good without any locking. A subsequent .tlb_sync has the locking anyways. > The downstream code doesn't appear to implement .tlb_add_flush at all, > so something smells off. Yes, the downstream pg-table driver never uses tlb_add_flush and tlb_sync to do TLBIs. Rather it relies on tlb_flush_all to flush the entire TLB after the unmap is finished. Moreover, in downstream a global remote spinlock is used to protect the invalidation requests owing to other entities besides kernel handling the TLB operations= . Assuming we don't want to do tlb_flush_all, we still want to serialize the invalidation requests, and then toggle the wait-safe logic too. We would also not want the remote spinlock as we are invalidating based on = ASIDs in the kernel. Regards Vivek > > Robin. > > > +} > > + > > static const struct iommu_gather_ops arm_smmu_s1_tlb_ops =3D { > > .tlb_flush_all =3D arm_smmu_tlb_inv_context_s1, > > .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, > > .tlb_sync =3D arm_smmu_tlb_sync_context, > > }; > > > > +static const struct iommu_gather_ops qcom_errata_s1_tlb_ops =3D { > > + .tlb_flush_all =3D qcom_errata_tlb_inv_context_s1, > > + .tlb_add_flush =3D qcom_errata_tlb_inv_range_nosync, > > + .tlb_sync =3D qcom_errata_tlb_sync_context, > > +}; > > + > > static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v2 =3D { > > .tlb_flush_all =3D arm_smmu_tlb_inv_context_s2, > > .tlb_add_flush =3D arm_smmu_tlb_inv_range_nosync, > > @@ -824,7 +949,11 @@ static int arm_smmu_init_domain_context(struct iom= mu_domain *domain, > > ias =3D min(ias, 32UL); > > oas =3D min(oas, 32UL); > > } > > - smmu_domain->tlb_ops =3D &arm_smmu_s1_tlb_ops; > > + if (of_device_is_compatible(smmu->dev->of_node, > > + "qcom,sdm845-smmu-500")) > > + smmu_domain->tlb_ops =3D &qcom_errata_s1_tlb_ops; > > + else > > + smmu_domain->tlb_ops =3D &arm_smmu_s1_tlb_ops; > > break; > > case ARM_SMMU_DOMAIN_NESTED: > > /* > > > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu --=20 QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation