From: Robin Murphy <robin.murphy@arm.com>
To: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Will Deacon <will@kernel.org>, Joerg Roedel <joro@8bytes.org>,
	Thierry Reding <treding@nvidia.com>,
	linux-arm-msm@vger.kernel.org,
	Douglas Anderson <dianders@chromium.org>,
	linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCHv2 1/3] iommu/io-pgtable: Add a quirk to use tlb_flush_all() for partial walk flush
Date: Tue, 22 Jun 2021 13:11:12 +0100
Message-ID: <2b093b93-7fab-be35-59d8-4463c199719a@arm.com>
In-Reply-To: <a110e58e36af207be2bed04d1331832a@codeaurora.org>

On 2021-06-22 08:11, Sai Prakash Ranjan wrote:
> Hi Robin,
> 
> On 2021-06-21 21:15, Robin Murphy wrote:
>> On 2021-06-18 03:51, Sai Prakash Ranjan wrote:
>>> Add a quirk IO_PGTABLE_QUIRK_TLB_INV_ALL to invalidate entire context
>>> with tlb_flush_all() callback in partial walk flush to improve unmap
>>> performance on select few platforms where the cost of over-invalidation
>>> is less than the unmap latency.
>>
>> I still think this doesn't belong anywhere near io-pgtable at all.
>> It's a driver-internal decision how exactly it implements a non-leaf
>> invalidation, and that may be more complex than a predetermined
>> boolean decision. For example, I've just realised for SMMUv3 we can't
>> invalidate multiple levels of table at once with a range command,
>> since if we assume the whole thing is mapped at worst-case page
>> granularity we may fail to invalidate any parts which are mapped as
>> intermediate-level blocks. If invalidating a 1GB region (with 4KB
>> granule) means having to fall back to 256K non-range commands, we may
>> not want to invalidate by VA then, even though doing so for a 2MB
>> region is still optimal.
>>
>> It's also quite feasible that drivers might want to do this for leaf
>> invalidations too - if you don't like issuing 512 commands to
>> invalidate 2MB, do you like issuing 511 commands to invalidate 2044KB?
>> - and at that point the logic really has to be in the driver anyway.
>>
> 
> Ok I will move this to tlb_flush_walk() functions in the drivers. In the
> previous v1 thread, you suggested to make the choice in
> iommu_get_dma_strict() test, I assume you meant the test in
> iommu_dma_init_domain() with a flag or was it the leaf
> driver(ex:arm-smmu.c) test of iommu_get_dma_strict() in init_domain?

Yes, I meant literally inside the same condition where we currently set 
"pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT;" in 
arm_smmu_init_domain_context().
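
A rough, untested sketch of what that placement might look like. Everything
prefixed sketch_/SKETCH_ is a stand-in invented for illustration, not the real
arm-smmu structures or quirk values; the only point is that the
over-invalidation choice is made in the same non-strict branch:

```c
#include <stdbool.h>

/* Stand-in for IO_PGTABLE_QUIRK_NON_STRICT; the value is illustrative only. */
#define SKETCH_QUIRK_NON_STRICT	(1UL << 0)

/* Hypothetical, cut-down domain state; not the real struct arm_smmu_domain. */
struct sketch_smmu_domain {
	unsigned long quirks;		/* what would go into pgtbl_cfg.quirks */
	bool flush_all_on_walk;		/* driver-private over-invalidation flag */
};

/*
 * 'strict' models the result of the existing strict/non-strict test;
 * 'impl_prefers_flush_all' models a per-implementation preference.
 */
static void sketch_init_domain_context(struct sketch_smmu_domain *d,
				       bool strict,
				       bool impl_prefers_flush_all)
{
	d->quirks = 0;
	d->flush_all_on_walk = false;

	if (!strict) {
		d->quirks |= SKETCH_QUIRK_NON_STRICT;
		/* Same condition: decide here whether to over-invalidate. */
		d->flush_all_on_walk = impl_prefers_flush_all;
	}
}
```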

> I am still a bit confused on where this flag would be? Should this be a
> part of struct iommu_domain?

Well, if you were to rewrite the config with an alternative set of 
flush_ops at that point it would be implicit. For a flag, probably 
either in arm_smmu_domain or arm_smmu_impl. Maybe a flag would be less 
useful than generalising straight to a "maximum number of by-VA 
invalidations it's worth sending individually" threshold value? It's 
clear to me what overall shape and separation of responsibility is most 
logical, but beyond that I don't have a particularly strong opinion on 
the exact implementation; I've just been chucking ideas around :)
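
For illustration only, the threshold idea could be sketched roughly as below.
The struct, field and function names are hypothetical (nothing here exists in
the driver), and it assumes one by-VA command per granule, which is the worst
case discussed above:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-domain knob; not an existing arm-smmu field. */
struct sketch_inv_policy {
	size_t max_inv_cmds;	/* most by-VA commands worth sending individually */
};

/* Number of per-granule commands needed to cover 'size' bytes (rounded up). */
static size_t sketch_inv_cmd_count(size_t size, size_t granule)
{
	return (size + granule - 1) / granule;
}

/* True if invalidating the whole context looks cheaper than going by VA. */
static bool sketch_should_flush_all(const struct sketch_inv_policy *p,
				    size_t size, size_t granule)
{
	return sketch_inv_cmd_count(size, granule) > p->max_inv_cmds;
}
```

With a 4KB granule and a threshold of 512, a 2MB region (512 commands) would
still be invalidated by VA, while a 1GB region (256K commands) would fall back
to a full flush. A driver would call something like this from both its walk
and leaf invalidation paths.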

Cheers,
Robin.
