From: John Garry <john.garry@huawei.com>
To: Robin Murphy <robin.murphy@arm.com>,
	"Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>,
	Will Deacon <will@kernel.org>, Joerg Roedel <joro@8bytes.org>,
	iommu <iommu@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>, Linuxarm <linuxarm@huawei.com>
Subject: Re: [PATCH 1/1] Revert "iommu/iova: Retry from last rb tree node if iova search fails"
Date: Mon, 8 Mar 2021 16:22:12 +0000
Message-ID: <c58abbec-7220-b440-98d4-d1026a8feed4@huawei.com>
In-Reply-To: <eacd5ccd-ab5a-27fe-6542-deaefd597d11@arm.com>

On 08/03/2021 15:15, Robin Murphy wrote:
>> I figure that you're talking about 4e89dce72521 now. I would have 
>> liked to know which real-life problem it solved in practice.
> 
>  From what I remember, the problem reported was basically the one 
> illustrated in that commit and the one I alluded to above - namely that 
> certain allocation patterns with a broad mix of sizes and relative 
> lifetimes end up pushing the cached PFN down to the bottom of the 
> address space such that allocations start failing despite there still 
> being sufficient free space overall, which was breaking some media 
> workload. What was originally proposed was an overcomplicated palaver 
> with DMA attributes and a whole extra allocation algorithm rather than 
> just fixing the clearly unintended and broken behaviour.

OK, fine. I just wondered whether this was only a theoretical problem.

> 
>>> While max32_alloc_size indirectly tracks the largest *contiguous*
>>> available space, one of the ideas from which it grew was to simply keep
>>> count of the total number of free PFNs. If you're really spending
>>> significant time determining that the tree is full, as opposed to just
>>> taking longer to eventually succeed, then it might be relatively
>>> innocuous to tack on that semi-redundant extra accounting as a
>>> self-contained quick fix for that worst case.
>>>

...
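
For what it's worth, I read that suggestion as something roughly like the
below - a semi-redundant free-PFN count, maintained at alloc/free time and
consulted before the expensive rbtree walk. All of the names here are made
up for illustration; this isn't from any posted patch:

#include <linux/types.h>

/* Illustrative sketch only - struct, field and helper names are hypothetical. */
struct iova_domain_acct {
	unsigned long free_pfns;	/* total PFNs not currently allocated */
};

static void acct_on_alloc(struct iova_domain_acct *acct, unsigned long size)
{
	acct->free_pfns -= size;
}

static void acct_on_free(struct iova_domain_acct *acct, unsigned long size)
{
	acct->free_pfns += size;
}

/*
 * Cheap check before walking the rbtree: if fewer than 'size' PFNs are
 * free in total, the allocation cannot possibly succeed. The converse
 * does not hold - enough free PFNs may still be too fragmented - hence
 * "semi-redundant".
 */
static bool acct_may_satisfy(struct iova_domain_acct *acct, unsigned long size)
{
	return acct->free_pfns >= size;
}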

>>
>> Even if it were configurable, wouldn't it make sense to have it 
>> configurable per IOVA domain?
> 
> Perhaps, but I don't see that being at all easy to implement. We can't 
> arbitrarily *increase* the scope of caching once a domain is active due 
> to the size-rounding-up requirement, which would be prohibitive to 
> larger allocations if applied universally.
> 

Agreed.

But having that available (all IOVA sizes being cacheable) could still be 
really useful for some situations, though.
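
For anyone following along: the size-rounding-up requirement is the
power-of-two round-up which dma-iommu applies to anything small enough to be
served by the rcaches, so that it can later be freed back into a size-class
cache. Roughly the below, paraphrased from memory rather than quoted, and
wrapped in a function name I've just invented:

#include <linux/iova.h>
#include <linux/log2.h>

/*
 * Paraphrased after iommu_dma_alloc_iova() in drivers/iommu/dma-iommu.c:
 * cacheable sizes are rounded up to a power of two at the cost of some
 * wasted space. Applying the same rounding universally would be
 * prohibitively wasteful for larger allocations.
 */
static unsigned long cacheable_iova_len(struct iova_domain *iovad, size_t size)
{
	unsigned long iova_len = size >> iova_shift(iovad);

	if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
		iova_len = roundup_pow_of_two(iova_len);

	return iova_len;
}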

>> Furthermore, as mentioned above, I still want to solve this IOVA aging 
>> issue, and this fixed RCACHE RANGE size seems to be at the center 
>> of that problem.
>>
>>>
>>>> As for 4e89dce72521, even if it's proper to retry for a failed 
>>>> alloc,
>>>> it is not always necessary. I mean, if we're limiting ourselves to 32b
>>>> subspace for this SAC trick and we fail the alloc, then we can try the
>>>> space above 32b first (if usable). If that fails, then retry there. I
>>>> don't see a need to retry the 32b subspace if we're not limited to it.
>>>> How about it? We tried that idea and it looks to just about restore
>>>> performance.
>>> The thing is, if you do have an actual PCI device where DAC might mean a
>>> 33% throughput loss and you're mapping a long-lived buffer, or you're on
>>> one of these systems where firmware fails to document address limits and
>>> using the full IOMMU address width quietly breaks things, then you
>>> almost certainly *do* want the allocator to actually do a proper job of
>>> trying to satisfy the given request.
>>
>> If those conditions were true, then it seems quite a tenuous position, 
>> so trying to help that scenario in general terms would have limited 
>> efficacy.
> 
> Still, I'd be curious to see if making the restart a bit cleverer offers 
> a noticeable improvement. IIRC I suggested it at the time, but in the 
> end the push was just to get *something* merged.

Sorry to say, I just tested that ("iommu/iova: Improve restart logic") 
and there is no obvious improvement.

I'll have a further look at what might be going on.
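
For reference, the alternative retry ordering I described above boils down
to something like this. It's only a rough sketch: alloc_iova_fast_cheap() is
a made-up stand-in for "alloc_iova_fast() minus the retry-from-last-rb-node
pass added by 4e89dce72521", not a real function:

#include <linux/iova.h>
#include <linux/dma-mapping.h>

/*
 * Rough sketch only. The point is the ordering: the opportunistic
 * 32-bit (SAC) attempt can afford to fail fast, since a failure there
 * can still be satisfied from the space above 32 bits; only an
 * allocation genuinely limited to the low space needs the expensive
 * retry.
 */
unsigned long alloc_iova_fast_cheap(struct iova_domain *iovad,
				    unsigned long size,
				    unsigned long limit_pfn,
				    bool flush_rcache); /* hypothetical */

static unsigned long sketch_alloc(struct iova_domain *iovad,
				  unsigned long iova_len, u64 dma_limit,
				  bool try_sac)
{
	unsigned long shift = iova_shift(iovad);
	unsigned long pfn = 0;

	if (try_sac && dma_limit > DMA_BIT_MASK(32)) {
		/* Opportunistic low-space attempt: no retry on failure */
		pfn = alloc_iova_fast_cheap(iovad, iova_len,
					    DMA_BIT_MASK(32) >> shift, false);
		if (pfn)
			return pfn;
	}

	/* Genuinely limited (or final) attempt: retrying is worthwhile */
	return alloc_iova_fast(iovad, iova_len, dma_limit >> shift, true);
}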

Thanks very much,
John
