From: John Garry <john.garry@huawei.com>
To: Robin Murphy <robin.murphy@arm.com>,
"Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>,
Will Deacon <will@kernel.org>, Joerg Roedel <joro@8bytes.org>,
iommu <iommu@lists.linux-foundation.org>,
linux-kernel <linux-kernel@vger.kernel.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>, Linuxarm <linuxarm@huawei.com>
Subject: Re: [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"
Date: Mon, 8 Mar 2021 16:22:12 +0000 [thread overview]
Message-ID: <c58abbec-7220-b440-98d4-d1026a8feed4@huawei.com> (raw)
In-Reply-To: <eacd5ccd-ab5a-27fe-6542-deaefd597d11@arm.com>
On 08/03/2021 15:15, Robin Murphy wrote:
>> I figure that you're talking about 4e89dce72521 now. I would have
>> liked to know which real-life problem it solved in practice.
>
> From what I remember, the problem reported was basically the one
> illustrated in that commit and the one I alluded to above - namely that
> certain allocation patterns with a broad mix of sizes and relative
> lifetimes end up pushing the cached PFN down to the bottom of the
> address space such that allocations start failing despite there still
> being sufficient free space overall, which was breaking some media
> workload. What was originally proposed was an overcomplicated palaver
> with DMA attributes and a whole extra allocation algorithm rather than
> just fixing the clearly unintended and broken behaviour.
ok, fine. I just wondered if this was a theoretical problem only.
>
>>> While max32_alloc_size indirectly tracks the largest*contiguous*
>>> available space, one of the ideas from which it grew was to simply keep
>>> count of the total number of free PFNs. If you're really spending
>>> significant time determining that the tree is full, as opposed to just
>>> taking longer to eventually succeed, then it might be relatively
>>> innocuous to tack on that semi-redundant extra accounting as a
>>> self-contained quick fix for that worst case.
>>>
...
>>
>> Even if it is were configurable, wouldn't it make sense to have it
>> configurable per IOVA domain?
>
> Perhaps, but I don't see that being at all easy to implement. We can't
> arbitrarily *increase* the scope of caching once a domain is active due
> to the size-rounding-up requirement, which would be prohibitive to
> larger allocations if applied universally.
>
Agreed.
But having that (all IOVAs sizes being cacheable) available could be
really great, though, for some situations.
>> Furthermore, as mentioned above, I still want to solve this IOVA aging
>> issue, and this fixed RCACHE RANGE size seems to be the at the center
>> of that problem.
>>
>>>
>>>> As for 4e89dce72521, so even if it's proper to retry for a failed
>>>> alloc,
>>>> it is not always necessary. I mean, if we're limiting ourselves to 32b
>>>> subspace for this SAC trick and we fail the alloc, then we can try the
>>>> space above 32b first (if usable). If that fails, then retry there. I
>>>> don't see a need to retry the 32b subspace if we're not limited to it.
>>>> How about it? We tried that idea and it looks to just about restore
>>>> performance.
>>> The thing is, if you do have an actual PCI device where DAC might mean a
>>> 33% throughput loss and you're mapping a long-lived buffer, or you're on
>>> one of these systems where firmware fails to document address limits and
>>> using the full IOMMU address width quietly breaks things, then you
>>> almost certainly*do* want the allocator to actually do a proper job of
>>> trying to satisfy the given request.
>>
>> If those conditions were true, then it seems quite a tenuous position,
>> so trying to help that scenario in general terms will have limited
>> efficacy.
>
> Still, I'd be curious to see if making the restart a bit cleverer offers
> a noticeable improvement. IIRC I suggested it at the time, but in the
> end the push was just to get *something* merged.
Sorry to say, I just tested that ("iommu/iova: Improve restart logic")
and there is no obvious improvement.
I'll have a further look at what might be going on.
Thanks very much,
John
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2021-03-08 16:24 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-01-29 9:21 [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails" Zhen Lei
2021-01-29 9:48 ` Leizhen (ThunderTown)
2021-01-29 12:03 ` Robin Murphy
2021-01-29 12:43 ` chenxiang (M)
2021-02-25 13:54 ` John Garry
2021-03-01 13:20 ` Robin Murphy
2021-03-01 15:48 ` John Garry
2021-03-02 12:30 ` John Garry
2021-03-08 15:15 ` Robin Murphy
2021-03-08 16:22 ` John Garry [this message]
2021-03-10 17:50 ` John Garry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c58abbec-7220-b440-98d4-d1026a8feed4@huawei.com \
--to=john.garry@huawei.com \
--cc=iommu@lists.linux-foundation.org \
--cc=joro@8bytes.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=robin.murphy@arm.com \
--cc=thunder.leizhen@huawei.com \
--cc=vjitta@codeaurora.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).