Re: [PATCH] iommu/arm-smmu-v3: Add SMMUv3.2 range invalidation support

From: Auger Eric <eric.auger@redhat.com>
To: Rob Herring <robh@kernel.org>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>,
	Linux IOMMU <iommu@lists.linux-foundation.org>,
	Will Deacon <will@kernel.org>,
	"moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] iommu/arm-smmu-v3: Add SMMUv3.2 range invalidation support
Date: Thu, 16 Jan 2020 22:23:17 +0100	[thread overview]
Message-ID: <4e56aa27-37f0-d8d9-46fd-871055abcb49@redhat.com> (raw)
In-Reply-To: <CAL_Jsq+fwdLfxgW=aoMNySrKunSgtC+i5ttsn1vCdR2p4BMPfA@mail.gmail.com>

Hi Rob,

On 1/16/20 5:57 PM, Rob Herring wrote:
> On Wed, Jan 15, 2020 at 10:33 AM Auger Eric <eric.auger@redhat.com> wrote:
>>
>> Hi Rob,
>>
>> On 1/15/20 3:02 PM, Rob Herring wrote:
>>> On Wed, Jan 15, 2020 at 3:21 AM Auger Eric <eric.auger@redhat.com> wrote:
>>>>
>>>> Hi Rob,
>>>>
>>>> On 1/13/20 3:39 PM, Rob Herring wrote:
>>>>> Arm SMMUv3.2 adds support for TLB range invalidate operations.
>>>>> Support for range invalidate is determined by the RIL bit in the IDR3
>>>>> register.
>>>>>
>>>>> The range invalidate is in units of the leaf page size and operates on
>>>>> 1-32 chunks of a power of 2 multiple pages. First we determine from the
>>>>> size what power of 2 multiple we can use and then adjust the granule to
>>>>> 32x that size.
> 
>>>>> @@ -2022,12 +2043,39 @@ static void arm_smmu_tlb_inv_range(unsigned long iova, size_t size,
>>>>>               cmd.tlbi.vmid   = smmu_domain->s2_cfg.vmid;
>>>>>       }
>>>>>
>>>>> +     if (smmu->features & ARM_SMMU_FEAT_RANGE_INV) {
>>>>> +             unsigned long tg, scale;
>>>>> +
>>>>> +             /* Get the leaf page size */
>>>>> +             tg = __ffs(smmu_domain->domain.pgsize_bitmap);
>>>> it is unclear to me why you can't set tg with the granule parameter.
>>>
>>> granule could be 2MB sections if THP is enabled, right?
>>
>> Ah OK I thought it was a page size and not a block size.
>>
>> I requested this feature a long time ago for virtual SMMUv3. With
>> DPDK/VFIO the guest was sending page TLB invalidation for each page
>> (granule=4K or 64K) part of the hugepage buffer and those were trapped
>> by the VMM. This stalled qemu.
> 
> I did some more testing to make sure THP is enabled, but haven't been
> able to get granule to be anything but 4K. I only have the Fast Model
> with AHCI on PCI to test this with. Maybe I'm hitting some place where
> THPs aren't supported yet.
> 
>>>>> +             /* Determine the power of 2 multiple number of pages */
>>>>> +             scale = __ffs(size / (1UL << tg));
>>>>> +             cmd.tlbi.scale = scale;
>>>>> +
>>>>> +             cmd.tlbi.num = CMDQ_TLBI_RANGE_NUM_MAX - 1;
>>>> Also could you explain why you use CMDQ_TLBI_RANGE_NUM_MAX.
>>>
>>> How's this:
>>> /* The invalidation loop defaults to the maximum range */
>> I would have expected num=0 directly. Don't we invalidate the &size in
>> one shot as 2^scale * pages of granularity @tg? I fail to understand
>> when NUM > 0.
> 
> NUM is > 0 anytime size is not a power of 2. For example, if size is
> 33 pages, then it takes 2 loops doing 32 pages and then 1 page. If
> size is 34 pages, then NUM is (17-1) and SCALE is 1.
OK I get it now. I misread the scale computation as log2() :-(.

I still have a doubt about the scale choice. What if you invalidate a
large number of pages such as 1025 pages. scale is 0 and you end up with
32 * 32 * 2^0 + 1 * 2 * 2^0  invalidations (33). Whereas you could
invalidate the whole range with 2 invalidation commands: 1 x 2^10 +
1*1^1 (packing the invalidations by largest scale). Am I correct or do I
still miss something?

Besides in the patch I think in the while loop the iova should be
incremented with the actual number of invalidated bytes and not the max
sized granule variable.

Thanks

Eric

> 
> Rob
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel