LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Steven Price <steven.price@arm.com>
To: Robin Murphy <robin.murphy@arm.com>,
	Rob Herring <robh@kernel.org>,
	dri-devel@lists.freedesktop.org
Cc: Neil Armstrong <narmstrong@baylibre.com>,
	Maxime Ripard <maxime.ripard@bootlin.com>,
	Will Deacon <will.deacon@arm.com>,
	linux-kernel@vger.kernel.org, David Airlie <airlied@linux.ie>,
	iommu@lists.linux-foundation.org,
	linux-arm-kernel@lists.infradead.org, Sean Paul <sean@poorly.run>,
	Alyssa Rosenzweig <alyssa@rosenzweig.io>
Subject: Re: [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format
Date: Fri, 5 Apr 2019 11:36:55 +0100
Message-ID: <0c5f474e-6dd4-8ba5-8c2b-52a99e9b1ca5@arm.com> (raw)
In-Reply-To: <a5a11a56-46dc-ab44-564b-d2216f32451e@arm.com>

On 05/04/2019 10:51, Robin Murphy wrote:
> Hi Steve,
> 
> On 05/04/2019 10:42, Steven Price wrote:
>> First let me say congratulations to everyone working on Panfrost - it's
>> an impressive achievement!
>>
>> Full disclosure: I used to work on the Mali kbase driver. And have been
>> playing around with running the Mali user-space blob with the Panfrost
>> kernel driver.
>>
>> On 01/04/2019 08:47, Rob Herring wrote:
>>> ARM Mali midgard GPU is similar to standard 64-bit stage 1 page
>>> tables, but
>>> have a few differences. Add a new format type to represent the
>>> format. The
>>> input address size is 48-bits and the output address size is 40-bits
>>> (and
>>> possibly less?). Note that the later bifrost GPUs follow the standard
>>> 64-bit stage 1 format.
>>>
>>> The differences in the format compared to 64-bit stage 1 format are:
>>>
>>> The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.
>>>
>>> The access flags are not read-only and unprivileged, but read and write.
>>> This is similar to stage 2 entries, but the memory attributes field
>>> matches
>>> stage 1 being an index.
>>>
>>> The nG bit is not set by the vendor driver. This one didn't seem to
>>> matter,
>>> but we'll keep it aligned to the vendor driver.
>>
>> The nG bit should be ignored by the hardware.
>>
>> The MMU in Midgard/Bifrost has a quirk similar to
>> IO_PGTABLE_QUIRK_TLBI_ON_MAP - you must perform a cache flush for the
>> GPU to (reliably) pick up new page table mappings.
>>
>> You may not have seen this because of the use of the JS_CONFIG_START_MMU
>> bit - this effectively performs a cache flush and TLB invalidate before
>> starting a job, however when using a GPU like T760 (e.g. on the Firefly
>> RK3288) this bit isn't being set. In my testing on the Firefly board I
>> saw GPU page faults because of this.
>>
>> There's two options for fixing this - a patch like below adds the quirk
>> mode to the MMU. Or alternatively always set JS_CONFIG_START_MMU on
>> jobs. In my testing both options solve the page faults.
>>
>> To be honest I don't know the reasoning behind kbase making the
>> JS_CONFIG_START_MMU bit conditional - I'm not aware of any reason why it
>> can't always be set. My guess is performance, but I haven't benchmarked
>> the difference between this and JS_CONFIG_START_MMU.
>>
>> -----8<----------
>>  From e3f75c7f04e43238dfc579029b8c11fb6b4a0c18 Mon Sep 17 00:00:00 2001
>> From: Steven Price <steven.price@arm.com>
>> Date: Thu, 4 Apr 2019 15:53:17 +0100
>> Subject: [PATCH] iommu: io-pgtable: IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE
>>
>> Midgard/Bifrost GPUs require a TLB invalidation when mapping pages,
>> implement IO_PGTABLE_QUIRK_TLBI_ON_MAP for LPAE iommu page table
>> formats and add the quirk bit to Panfrost.
>>
>> Signed-off-by: Steven Price <steven.price@arm.com>
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_mmu.c |  1 +
>>   drivers/iommu/io-pgtable-arm.c          | 11 +++++++++--
>>   2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> index f3aad8591cf4..094312074d66 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> @@ -343,6 +343,7 @@ int panfrost_mmu_init(struct panfrost_device *pfdev)
>>       mmu_write(pfdev, MMU_INT_MASK, ~0);
>>
>>       pfdev->mmu->pgtbl_cfg = (struct io_pgtable_cfg) {
>> +        .quirks        = IO_PGTABLE_QUIRK_TLBI_ON_MAP,
>>           .pgsize_bitmap    = SZ_4K, // | SZ_2M | SZ_1G),
>>           .ias        = 48,
>>           .oas        = 40,    /* Should come from dma mask? */
>> diff --git a/drivers/iommu/io-pgtable-arm.c
>> b/drivers/iommu/io-pgtable-arm.c
>> index 84beea1f47a7..45fd7bbdf9aa 100644
>> --- a/drivers/iommu/io-pgtable-arm.c
>> +++ b/drivers/iommu/io-pgtable-arm.c
>> @@ -505,7 +505,13 @@ static int arm_lpae_map(struct io_pgtable_ops *ops,
>> unsigned long iova,
>>        * Synchronise all PTE updates for the new mapping before there's
>>        * a chance for anything to kick off a table walk for the new iova.
>>        */
>> -    wmb();
>> +    if (data->iop.cfg.quirks & IO_PGTABLE_QUIRK_TLBI_ON_MAP) {
>> +        io_pgtable_tlb_add_flush(&data->iop, iova, size,
>> +                     ARM_LPAE_BLOCK_SIZE(2, data), false);
> 
> For correctness (in case this ever ends up used for something with
> VMSA-like invalidation behaviour), the granule would need to be "size"
> here, rather than effectively hard-coded.

Ah yes - I did rather just copy/paste this from io-pgtable-arm-v7s with
minor fix-ups.

> However, since Mali's invalidations appear to operate on arbitrary
> ranges, it would probably be a lot more efficient for the driver to
> handle this case directly, by just issuing a single big invalidation
> after the for_each_sg() loop in panfrost_mmu_map().

Yes - that would probably be a better option. Although I think
personally I'd lean towards just using JS_CONFIG_START_MMU for most
cases. The only thing that won't handle is modifying the MMU while the
job is running (e.g. faulting in pages). But that can be handled
internally in Panfrost by invalidating the exact region which is being
populated.

Steve

> Robin.
> 
>> +        io_pgtable_tlb_sync(&data->iop);
>> +    } else {
>> +        wmb();
>> +    }
>>
>>       return ret;
>>   }
>> @@ -800,7 +806,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg
>> *cfg, void *cookie)
>>       struct arm_lpae_io_pgtable *data;
>>
>>       if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS |
>> IO_PGTABLE_QUIRK_NO_DMA |
>> -                IO_PGTABLE_QUIRK_NON_STRICT))
>> +                IO_PGTABLE_QUIRK_NON_STRICT |
>> +                IO_PGTABLE_QUIRK_TLBI_ON_MAP))
>>           return NULL;
>>
>>       data = arm_lpae_alloc_pgtable(cfg);
>>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


  reply index

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-01  7:47 [PATCH v2 0/3] Initial Panfrost driver Rob Herring
2019-04-01  7:47 ` [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format Rob Herring
2019-04-01 19:11   ` Robin Murphy
2019-04-05 10:02     ` Robin Murphy
2019-04-11 13:15     ` Joerg Roedel
2019-04-05  9:42   ` Steven Price
2019-04-05  9:51     ` Robin Murphy
2019-04-05 10:36       ` Steven Price [this message]
2019-04-08  8:56         ` Steven Price
2019-04-01  7:47 ` [PATCH v2 2/3] drm: Add a drm_gem_objects_lookup helper Rob Herring
2019-04-01 13:06   ` Daniel Vetter
2019-04-01 13:48     ` Chris Wilson
2019-04-01 15:43       ` Eric Anholt
2019-04-08 20:09         ` Rob Herring
2019-04-09 16:55           ` Eric Anholt
2019-04-01 16:59     ` Rob Herring
2019-04-01 18:22       ` Eric Anholt
2019-04-01  7:47 ` [PATCH v2 3/3] drm/panfrost: Add initial panfrost driver Rob Herring
2019-04-01  8:24   ` Neil Armstrong
2019-04-01 19:17     ` Robin Murphy
2019-04-01 16:02   ` Eric Anholt
2019-04-01 19:12   ` Robin Murphy
2019-04-02  0:33     ` Alyssa Rosenzweig
2019-04-02 11:23       ` Robin Murphy
2019-04-03  4:57     ` Rob Herring
2019-04-05 12:57       ` Robin Murphy
2019-04-05 12:30   ` Steven Price
2019-04-05 16:16     ` Alyssa Rosenzweig
2019-04-05 16:42       ` Steven Price
2019-04-05 16:53         ` Alyssa Rosenzweig
2019-04-15  9:18         ` Daniel Vetter
2019-04-15  9:30           ` Steven Price
2019-04-16  7:51             ` Daniel Vetter
2019-04-08 21:04     ` Rob Herring
2019-04-09 15:56       ` Tomeu Vizoso
2019-04-09 16:15         ` Rob Herring
2019-04-10 10:28           ` Steven Price
2019-04-10 10:19       ` Steven Price
2019-04-10 11:50         ` Tomeu Vizoso
2019-04-01 15:05 ` [PATCH v2 0/3] Initial Panfrost driver Alyssa Rosenzweig
2019-04-09 20:54 [PATCH v3 " Rob Herring
2019-04-09 20:54 ` [PATCH v3 1/3] iommu: io-pgtable: Add ARM Mali midgard MMU page table format Rob Herring

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0c5f474e-6dd4-8ba5-8c2b-52a99e9b1ca5@arm.com \
    --to=steven.price@arm.com \
    --cc=airlied@linux.ie \
    --cc=alyssa@rosenzweig.io \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maxime.ripard@bootlin.com \
    --cc=narmstrong@baylibre.com \
    --cc=robh@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=sean@poorly.run \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox