From: Mostafa Saleh <smostafa@google.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: iommu@lists.linux.dev, Joerg Roedel <joro@8bytes.org>,
linux-arm-kernel@lists.infradead.org,
Robin Murphy <robin.murphy@arm.com>,
Will Deacon <will@kernel.org>, Eric Auger <eric.auger@redhat.com>,
Moritz Fischer <mdf@kernel.org>,
Moritz Fischer <moritzf@google.com>,
Michael Shavit <mshavit@google.com>,
Nicolin Chen <nicolinc@nvidia.com>,
patches@lists.linux.dev,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH v7 5/9] iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
Date: Fri, 19 Apr 2024 21:14:21 +0000
Message-ID: <ZiLerRK1pXvt-HML@google.com>
In-Reply-To: <5-v7-cb149db3a320+3b5-smmuv3_newapi_p2_jgg@nvidia.com>
Hi Jason,
On Tue, Apr 16, 2024 at 04:28:16PM -0300, Jason Gunthorpe wrote:
> Only the attach callers can perform an allocation for the CD table entry,
> the other callers must not do so, they do not have the correct locking and
> they cannot sleep. Split up the functions so this is clear.
>
> arm_smmu_get_cd_ptr() will return pointer to a CD table entry without
> doing any kind of allocation.
>
> arm_smmu_alloc_cd_ptr() will allocate the table and any required
> leaf.
>
> A following patch will add lockdep assertions to arm_smmu_alloc_cd_ptr()
> once the restructuring is completed and arm_smmu_alloc_cd_ptr() is never
> called in the wrong context.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 61 +++++++++++++--------
> 1 file changed, 39 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> index f3df1ec8d258dc..a0d1237272936f 100644
> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> @@ -98,6 +98,7 @@ static struct arm_smmu_option_prop arm_smmu_options[] = {
>
> static int arm_smmu_domain_finalise(struct arm_smmu_domain *smmu_domain,
> struct arm_smmu_device *smmu);
> +static int arm_smmu_alloc_cd_tables(struct arm_smmu_master *master);
>
> static void parse_driver_options(struct arm_smmu_device *smmu)
> {
> @@ -1207,29 +1208,51 @@ static void arm_smmu_write_cd_l1_desc(__le64 *dst,
> struct arm_smmu_cd *arm_smmu_get_cd_ptr(struct arm_smmu_master *master,
> u32 ssid)
> {
> - __le64 *l1ptr;
> - unsigned int idx;
> struct arm_smmu_l1_ctx_desc *l1_desc;
> - struct arm_smmu_device *smmu = master->smmu;
> struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
>
> + if (!cd_table->cdtab)
> + return NULL;
> +
> if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_LINEAR)
> return (struct arm_smmu_cd *)(cd_table->cdtab +
> ssid * CTXDESC_CD_DWORDS);
>
> - idx = ssid >> CTXDESC_SPLIT;
> - l1_desc = &cd_table->l1_desc[idx];
> - if (!l1_desc->l2ptr) {
> - if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> - return NULL;
> + l1_desc = &cd_table->l1_desc[ssid / CTXDESC_L2_ENTRIES];
These operations used to be a shift and a bit mask, which made sense because
that mirrors what the hardware does. Is there any reason you changed them to
division and modulo? I checked the disassembly and gcc does the right thing,
since the constants are powers of 2, but I am just curious.
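As a minimal userspace sketch (not kernel code) of why the two forms are
interchangeable here: for an unsigned value and a power-of-2 divisor, the
quotient equals the right shift and the remainder equals the low-bit mask.
The constant values and the 20-bit SSID bound below are assumptions for
illustration, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed values for illustration only; see arm-smmu-v3.h for the real ones. */
#define CTXDESC_SPLIT       10
#define CTXDESC_L2_ENTRIES  (1u << CTXDESC_SPLIT)

/* Old style: shift for the L1 index, mask for the L2 index. */
static inline uint32_t l1_idx_shift(uint32_t ssid)
{
	return ssid >> CTXDESC_SPLIT;
}

static inline uint32_t l2_idx_mask(uint32_t ssid)
{
	return ssid & (CTXDESC_L2_ENTRIES - 1);
}

/* New style: divide for the L1 index, modulo for the L2 index. */
static inline uint32_t l1_idx_div(uint32_t ssid)
{
	return ssid / CTXDESC_L2_ENTRIES;
}

static inline uint32_t l2_idx_mod(uint32_t ssid)
{
	return ssid % CTXDESC_L2_ENTRIES;
}

/* Returns 1 if both styles agree for every SSID below 2^20, else 0. */
static int ssid_index_forms_agree(void)
{
	for (uint32_t ssid = 0; ssid < (1u << 20); ssid++) {
		if (l1_idx_shift(ssid) != l1_idx_div(ssid) ||
		    l2_idx_mask(ssid) != l2_idx_mod(ssid))
			return 0;
	}
	return 1;
}
```

Calling ssid_index_forms_agree() from a small test program checks the whole
range. The equivalence relies on ssid being unsigned; with a signed type,
division and shift differ for negative values.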
> + if (!l1_desc->l2ptr)
> + return NULL;
> + return &l1_desc->l2ptr[ssid % CTXDESC_L2_ENTRIES];
> +}
>
> - l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
> - arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
> - /* An invalid L1CD can be cached */
> - arm_smmu_sync_cd(master, ssid, false);
> +static struct arm_smmu_cd *arm_smmu_alloc_cd_ptr(struct arm_smmu_master *master,
> + u32 ssid)
> +{
> + struct arm_smmu_ctx_desc_cfg *cd_table = &master->cd_table;
> + struct arm_smmu_device *smmu = master->smmu;
> +
> + if (!cd_table->cdtab) {
> + if (arm_smmu_alloc_cd_tables(master))
> + return NULL;
> }
> - idx = ssid & (CTXDESC_L2_ENTRIES - 1);
> - return &l1_desc->l2ptr[idx];
> +
> + if (cd_table->s1fmt == STRTAB_STE_0_S1FMT_64K_L2) {
> + unsigned int idx = ssid >> CTXDESC_SPLIT;
OK, now it's a shift. I think we should be consistent in how we
calculate the index.
> + struct arm_smmu_l1_ctx_desc *l1_desc;
> +
> + l1_desc = &cd_table->l1_desc[idx];
> + if (!l1_desc->l2ptr) {
> + __le64 *l1ptr;
> +
> + if (arm_smmu_alloc_cd_leaf_table(smmu, l1_desc))
> + return NULL;
> +
> + l1ptr = cd_table->cdtab + idx * CTXDESC_L1_DESC_DWORDS;
> + arm_smmu_write_cd_l1_desc(l1ptr, l1_desc);
> + /* An invalid L1CD can be cached */
> + arm_smmu_sync_cd(master, ssid, false);
> + }
> + }
> + return arm_smmu_get_cd_ptr(master, ssid);
> }
>
> struct arm_smmu_cd_writer {
> @@ -1357,7 +1380,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_master *master, int ssid,
> if (WARN_ON(ssid >= (1 << cd_table->s1cdmax)))
> return -E2BIG;
>
> - cd_table_entry = arm_smmu_get_cd_ptr(master, ssid);
> + cd_table_entry = arm_smmu_alloc_cd_ptr(master, ssid);
The only path that allocates the main table is arm_smmu_attach_dev(). I think
it would be more robust to leave that as is and have two versions of get_cd:
one that allocates the leaf and one that does not allocate. What do you think?
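A rough userspace sketch of that shape, using a toy two-level table rather
than the driver's structures. Every name here (toy_cd, toy_cd_table,
toy_get_cd, toy_get_cd_alloc_leaf) is hypothetical, the sizes are assumed, and
calloc() stands in for whatever allocation the driver would actually do:

```c
#include <stdint.h>
#include <stdlib.h>

#define SPLIT       10              /* assumed, mirrors CTXDESC_SPLIT */
#define L2_ENTRIES  (1u << SPLIT)
#define L1_ENTRIES  4u              /* toy size for this sketch */

struct toy_cd {
	uint64_t data[8];
};

struct toy_cd_table {
	struct toy_cd *l2[L1_ENTRIES];  /* NULL until a leaf is allocated */
};

/* Lookup only: never allocates, returns NULL if the leaf is missing. */
static struct toy_cd *toy_get_cd(struct toy_cd_table *t, uint32_t ssid)
{
	uint32_t idx = ssid >> SPLIT;

	if (idx >= L1_ENTRIES || !t->l2[idx])
		return NULL;
	return &t->l2[idx][ssid & (L2_ENTRIES - 1)];
}

/* Allocates the leaf on demand, then defers to the lookup-only helper. */
static struct toy_cd *toy_get_cd_alloc_leaf(struct toy_cd_table *t,
					    uint32_t ssid)
{
	uint32_t idx = ssid >> SPLIT;

	if (idx >= L1_ENTRIES)
		return NULL;
	if (!t->l2[idx]) {
		t->l2[idx] = calloc(L2_ENTRIES, sizeof(struct toy_cd));
		if (!t->l2[idx])
			return NULL;
	}
	return toy_get_cd(t, ssid);
}
```

In this sketch the top-level table is assumed to exist already, which
corresponds to keeping the main table allocation in the attach path; only the
leaf allocation moves into the allocating variant.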
Thanks,
Mostafa
> if (!cd_table_entry)
> return -ENOMEM;
>
> @@ -2687,13 +2710,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> struct arm_smmu_cd target_cd;
> struct arm_smmu_cd *cdptr;
>
> - if (!master->cd_table.cdtab) {
> - ret = arm_smmu_alloc_cd_tables(master);
> - if (ret)
> - goto out_list_del;
> - }
> -
> - cdptr = arm_smmu_get_cd_ptr(master, IOMMU_NO_PASID);
> + cdptr = arm_smmu_alloc_cd_ptr(master, IOMMU_NO_PASID);
> if (!cdptr) {
> ret = -ENOMEM;
> goto out_list_del;
> --
> 2.43.2
>