iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: Tomasz Figa <tfiga@chromium.org>
To: Yong Wu <yong.wu@mediatek.com>
Cc: youlin.pei@mediatek.com, anan.sun@mediatek.com,
	Nicolas Boichat <drinkcat@chromium.org>,
	cui.zhang@mediatek.com,
	srv_heupstream <srv_heupstream@mediatek.com>,
	Will Deacon <will.deacon@arm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Evan Green <evgreen@chromium.org>,
	chao.hao@mediatek.com,
	"list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	" <iommu@lists.linux-foundation.org>,
	"moderated list:ARM/Mediatek SoC support"
	<linux-mediatek@lists.infradead.org>,
	Matthias Brugger <matthias.bgg@gmail.com>,
	Robin Murphy <robin.murphy@arm.com>,
	"list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,
	" <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH] iommu/mediatek: Move the tlb_sync into tlb_flush
Date: Thu, 10 Oct 2019 00:08:37 +0900	[thread overview]
Message-ID: <CAAFQd5Aj-DDofzQ1vyKT9OcXpf-Udfx4zeWtVGcQf7S-o+mStQ@mail.gmail.com> (raw)
In-Reply-To: <1570628307.19130.53.camel@mhfsdcap03>

On Wed, Oct 9, 2019 at 10:38 PM Yong Wu <yong.wu@mediatek.com> wrote:
>
> On Wed, 2019-10-09 at 16:56 +0900, Tomasz Figa wrote:
> > On Tue, Oct 8, 2019 at 5:09 PM Yong Wu <yong.wu@mediatek.com> wrote:
> > >
> > > Hi Tomasz,
> > >
> > > Sorry for reply late.
> > >
> > > On Wed, 2019-10-02 at 14:18 +0900, Tomasz Figa wrote:
> > > > Hi Yong,
> > > >
> > > > On Mon, Sep 30, 2019 at 2:42 PM Yong Wu <yong.wu@mediatek.com> wrote:
> > > > >
> > > > > The commit 4d689b619445 ("iommu/io-pgtable-arm-v7s: Convert to IOMMU API
> > > > > TLB sync") help move the tlb_sync of unmap from v7s into the iommu
> > > > > framework. It helps add a new function "mtk_iommu_iotlb_sync", But it
> > > > > lacked the dom->pgtlock, then it will cause the variable
> > > > > "tlb_flush_active" may be changed unexpectedly, we could see this warning
> > > > > log randomly:
> > > > >
> > > >
> > > > Thanks for the patch! Please see my comments inline.
> > > >
> > > > > mtk-iommu 10205000.iommu: Partial TLB flush timed out, falling back to
> > > > > full flush
> > > > >
> > > > > To fix this issue, we can add dom->pgtlock in the "mtk_iommu_iotlb_sync".
> > > > > And when checking this issue, we find that __arm_v7s_unmap call
> > > > > io_pgtable_tlb_add_flush consecutively when it is supersection/largepage,
> > > > > this also is potential unsafe for us. There is no tlb flush queue in the
> > > > > MediaTek M4U HW. The HW always expect the tlb_flush/tlb_sync one by one.
> > > > > If v7s don't always gurarantee the sequence, Thus, In this patch I move
> > > > > the tlb_sync into tlb_flush(also rename the function deleting "_nosync").
> > > > > and we don't care if it is leaf, rearrange the callback functions. Also,
> > > > > the tlb flush/sync was already finished in v7s, then iotlb_sync and
> > > > > iotlb_sync_all is unnecessary.
> > > >
> > > > Performance-wise, we could do much better. Instead of synchronously
> > > > syncing at the end of mtk_iommu_tlb_add_flush(), we could sync at the
> > > > beginning, if there was any previous flush still pending. We would
> > > > also have to keep the .iotlb_sync() callback, to take care of waiting
> > > > for the last flush. That would allow better pipelining with CPU in
> > > > cases like this:
> > > >
> > > > for (all pages in range) {
> > > >    change page table();
> > > >    flush();
> > > > }
> > > >
> > > > "change page table()" could execute while the IOMMU is flushing the
> > > > previous change.
> > >
> > > Do you mean adding a new tlb_sync before tlb_flush_no_sync, like below:
> > >
> > > mtk_iommu_tlb_add_flush_nosync {
> > >    + mtk_iommu_tlb_sync();
> > >    tlb_flush_no_sync();
> > >    data->tlb_flush_active = true;
> > > }
> > >
> > > mtk_iommu_tlb_sync {
> > >         if (!data->tlb_flush_active)
> > >                 return;
> > >         tlb_sync();
> > >         data->tlb_flush_active = false;
> > > }
> > >
> > > This way look improve the flow, But adjusting the flow is not the root
> > > cause of this issue. the problem is "data->tlb_flush_active" may be
> > > changed from mtk_iommu_iotlb_sync which don't have a dom->pglock.
> >
> > That was not the only problem with existing code. Existing code also
> > assumed that add_flush and sync always go in pairs, but that's not
> > true.
>
> Yes. Thus I put the tlb_flush always followed by tlb_sync to make sure
> they always go in pairs.
>
> >
> > My suggestion is to fix the locking in the driver and keep the sync
> > deferred as much as possible, so that performance is not degraded. I
>
> I really didn't get this timeout warning log in previous kernel(Many
> tlb_flush followed by one tlb_sync),

Locking issues typically lead to timing problems (race conditions), so
it might just be that the sequence or timing of calls changed between
kernel versions, enough to trigger the issue.

> But deferring the sync is not
> suggested by our DE, thus I still would like to fix the sequence in this
> patch with putting them together.
>

What's the reason it's not suggested? From my understanding, there
shouldn't be any dependency on hardware design here, it's just a
simple software optimization - we can pipeline other CPU operations
while the IOMMU is flushing the TLB.

Basically, right now:

CPU writes page tables 1
IOMMU flushes 1 | CPU busy waits
CPU writes page tables 2
IOMMU flushes 2 | CPU busy waits
CPU writes page tables 3
IOMMU flushes 3 | CPU busy waits
...

With my suggestion that could be:

CPU writes page tables 1 I
IOMMU flushes 1 | CPU writes page tables 2
IOMMU flushes 1 | CPU busy waits less time
IOMMU flushes 2 | CPU writes page tables 3
IOMMU flushes 2 | CPU busy waits less time
IOMMU flushes 3 | CPU busy waits

It reduces the time the CPU spends on busy waiting rather than doing
something useful. It also reduces the total time of maps and unmaps.

Actually, we can optimize even more. Please consider the case below.

CPU writes PTE for IOVA 0x1000.
IOMMU flushes 0x1000 | CPU busy waits
CPU writes PTE for IOVA 0x2000.
IOMMU flushes 0x2000 | CPU busy waits
CPU writes PTE for IOVA 0x3000.
IOMMU flushes 0x3000 | CPU busy waits
CPU writes PTE for IOVA 0x8000.
IOMMU flushes 0x8000 | CPU busy waits

However, depending on the way the hardware implements TLB flush, the
optimal sequence of operations could be:

CPU writes PTE for IOVA 0x1000.
CPU writes PTE for IOVA 0x2000.
CPU writes PTE for IOVA 0x3000.
IOMMU flushes 0x1000-0x3000 | CPU writes PTE for IOVA 0x8000.
IOMMU flushes 0x1000-0x3000 | CPU busy waits remaining flush time.
IOMMU flushes 0x8000 | CPU busy waits.

What's the algorithmic complexity of the TLB flush operation on
MediaTek IOMMU? If N is the number of pages to flush, is it O(N), O(1)
or something else?

Best regards,
Tomasz
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

      reply	other threads:[~2019-10-09 15:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-30  5:42 [PATCH] iommu/mediatek: Move the tlb_sync into tlb_flush Yong Wu
2019-09-30 12:09 ` Will Deacon
2019-10-08  8:18   ` Yong Wu
2019-10-02  5:18 ` Tomasz Figa
2019-10-02 10:35   ` Robin Murphy
2019-10-08  8:09     ` Yong Wu
2019-10-08  8:09   ` Yong Wu
2019-10-09  7:56     ` Tomasz Figa
2019-10-09 13:38       ` Yong Wu
2019-10-09 15:08         ` Tomasz Figa [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAAFQd5Aj-DDofzQ1vyKT9OcXpf-Udfx4zeWtVGcQf7S-o+mStQ@mail.gmail.com \
    --to=tfiga@chromium.org \
    --cc=anan.sun@mediatek.com \
    --cc=chao.hao@mediatek.com \
    --cc=cui.zhang@mediatek.com \
    --cc=drinkcat@chromium.org \
    --cc=evgreen@chromium.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=matthias.bgg@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=srv_heupstream@mediatek.com \
    --cc=will.deacon@arm.com \
    --cc=yong.wu@mediatek.com \
    --cc=youlin.pei@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).