All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: Rob Herring <robh@kernel.org>, Steven Price <steven.price@arm.com>
Cc: dri-devel <dri-devel@lists.freedesktop.org>,
	Boris Brezillon <boris.brezillon@collabora.com>,
	Alyssa Rosenzweig <alyssa@rosenzweig.io>,
	Tomeu Vizoso <tomeu.vizoso@collabora.com>
Subject: Re: [PATCH 3/5] drm/panfrost: Add a no execute flag for BO allocations
Date: Mon, 22 Jul 2019 10:50:07 +0100	[thread overview]
Message-ID: <4b7fc0b4-aa5b-06ba-ad4a-5b959e265e67@arm.com> (raw)
In-Reply-To: <CAL_Jsq+ygY64WP6GP2LB4WRt2_BCXMMWxQSyhazY=jWfCyOkLg@mail.gmail.com>

On 19/07/2019 23:07, Rob Herring wrote:
> On Fri, Jul 19, 2019 at 4:39 AM Steven Price <steven.price@arm.com> wrote:
>>
>> On 18/07/2019 18:03, Rob Herring wrote:
>>> On Thu, Jul 18, 2019 at 9:03 AM Steven Price <steven.price@arm.com> wrote:
>>>>
>>>> On 17/07/2019 19:33, Rob Herring wrote:
>>>>> Executable buffers have an alignment restriction that they can't cross
>>>>> 16MB boundary as the GPU program counter is 24-bits. This restriction is
>>>>> currently not handled and we just get lucky. As current userspace
>>>>
>>>> Actually it depends which GPU you are using - some do have a bigger
>>>> program counter - so it might be the choice of GPU to test on rather
>>>> than just luck. But kbase always assumes the worst case (24 bit) as in
>>>> practise that's enough.
>>>>
>>>>> assumes all BOs are executable, that has to remain the default. So add a
>>>>> new PANFROST_BO_NOEXEC flag to allow userspace to indicate which BOs are
>>>>> not executable.
>>>>>
>>>>> There is also a restriction that executable buffers cannot start or end
>>>>> on a 4GB boundary. This is mostly avoided as there is only 4GB of space
>>>>> currently and the beginning is already blocked out for NULL ptr
>>>>> detection. Add support to handle this restriction fully regardless of
>>>>> the current constraints.
>>>>>
>>>>> For existing userspace, all created BOs remain executable, but the GPU
>>>>> VA alignment will be increased to the size of the BO. This shouldn't
>>>>> matter as there is plenty of GPU VA space.
>>>>>
>>>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>>>> Cc: Robin Murphy <robin.murphy@arm.com>
>>>>> Cc: Steven Price <steven.price@arm.com>
>>>>> Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io>
>>>>> Signed-off-by: Rob Herring <robh@kernel.org>
>>>>> ---
>>>>>   drivers/gpu/drm/panfrost/panfrost_drv.c | 20 +++++++++++++++++++-
>>>>>   drivers/gpu/drm/panfrost/panfrost_gem.c | 18 ++++++++++++++++--
>>>>>   drivers/gpu/drm/panfrost/panfrost_gem.h |  3 ++-
>>>>>   drivers/gpu/drm/panfrost/panfrost_mmu.c |  3 +++
>>>>>   include/uapi/drm/panfrost_drm.h         |  2 ++
>>>>>   5 files changed, 42 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>> index d354b92964d5..b91e991bc6a3 100644
>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
>>>>> @@ -49,7 +49,8 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
>>>>>        struct panfrost_gem_object *bo;
>>>>>        struct drm_panfrost_create_bo *args = data;
>>>>>
>>>>> -     if (!args->size || args->flags || args->pad)
>>>>> +     if (!args->size || args->pad ||
>>>>> +         (args->flags & ~PANFROST_BO_NOEXEC))
>>>>>                return -EINVAL;
>>>>>
>>>>>        bo = panfrost_gem_create_with_handle(file, dev, args->size, args->flags,
>>>>> @@ -367,6 +368,22 @@ static struct drm_driver panfrost_drm_driver = {
>>>>>        .gem_prime_mmap         = drm_gem_prime_mmap,
>>>>>   };
>>>>>
>>>>> +#define PFN_4G_MASK    ((SZ_4G - 1) >> PAGE_SHIFT)
>>>>> +
>>>>> +static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
>>>>> +                                      unsigned long color,
>>>>> +                                      u64 *start, u64 *end)
>>>>> +{
>>>>> +     /* Executable buffers can't start or end on a 4GB boundary */
>>>>> +     if (!color) {
>>>>> +             if ((*start & PFN_4G_MASK) == 0)
>>>>> +                     (*start)++;
>>>>> +
>>>>> +             if ((*end & PFN_4G_MASK) == 0)
>>>>> +                     (*end)--;
>>>>> +     }
>>>>> +}
>>>>
>>>> Unless I'm mistaken this won't actually provide the guarantee if the
>>>> memory region is >4GB (which admittedly it isn't at the moment). For
>>>> example a 8GB region would have the beginning/end trimmed off, but
>>>> there's another 4GB in the middle which could be allocated next to.
>>>
>>> Humm, other than not allowing sizes greater than 4G-(2*PAGE_SIZE), I'm
>>> not sure how we solve that. I guess avoiding sizes of (n*4G -
>>> PAGE_SIZE) would avoid the problem. Seems like almost 4GB executable
>>> buffer should be enough for anyone(TM).
>>
>> I was thinking of something like:
>>
>> if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) {
>>          if ((*start & PFN_4G_MASK) +
>>              (SZ_16M >> PAGE_SHIFT) < PFN_4G_MASK)
>>                  *end = (*start & ~PFN_4G_MASK) +
>>                          (SZ_4GB - SZ_16M) >> PAGE_SHIFT;
>>          else
>>                  *start = (*end & ~PFN_4G_MASK) + (SZ_16M) >> PAGE_SHIFT;
>> }
>>
>> So split the region depending on where we can find a free 16MB region
>> which doesn't cross 4GB.
> 
> Here's what I ended up with. It's slightly different in that the start
> and end don't get 16MB aligned. The code already takes care of the
> alignment which also is not necessarily 16MB, but 'align = size' as
> that's sufficient to not cross a 16MB boundary.
> 
> #define PFN_4G          (SZ_4G >> PAGE_SHIFT)
> #define PFN_4G_MASK     (PFN_4G - 1)
> #define PFN_16M         (SZ_16M >> PAGE_SHIFT)
> 
> static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
>                                           unsigned long color,
>                                           u64 *start, u64 *end)
> {
>          /* Executable buffers can't start or end on a 4GB boundary */
>          if (!(color & PANFROST_BO_NOEXEC)) {
>                  if ((*start & PFN_4G_MASK) == 0)
>                          (*start)++;
> 
>                  if ((*end & PFN_4G_MASK) == 0)
>                          (*end)--;
> 
>                  /* Ensure start and end are in the same 4GB range */
>                  if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) {
>                          if ((*start & PFN_4G_MASK) + PFN_16M < PFN_4G_MASK)
>                                  *end = (*start & ~PFN_4G_MASK) + PFN_4G - 1;
>                          else
>                                  *start = (*end & ~PFN_4G_MASK) + 1;
> 
>                  }
>          }
> }

If you're happy with the additional restriction that a buffer can't 
cross a 4GB boundary (which does seem to be required for certain kinds 
of buffers anyway), then I don't think it needs to be anywhere near that 
complex:

	if (!(*start & PFN_4G_MASK))
		*start++
	*end = min(*end, ALIGN(*start, PFN_4G) - 1);

>>
>> But like you say: 4GB should be enough for anyone ;)
>>
>>>> Also a minor issue, but we might want to consider having some constants
>>>> for 'color' - it's not obvious from this code that color==no_exec.
>>>
>>> Yeah, I was just going worry about that when we had a second bit to pass.
>>
>> One other 'minor' issue I noticed as I was writing the above. PAGE_SIZE
>> is of course the CPU page size - we could have the situation of CPU and
>> GPU page size being different (e.g. CONFIG_ARM64_64K_PAGES). I'm not
>> sure whether we want to support this or not (kbase doesn't). Also in
>> theory some GPUs do support 64K (AS_TRANSCFG_ADRMODE_AARCH64_64K) - but
>> kbase has never used this so I don't know if it works... ;)
> 
> Shhh! I have thought about that, but was ignoring for now.

In general, I don't think that should be too great a concern - we 
already handle it for IOMMUs, provided the minimum IOMMU page size is no 
larger than PAGE_SIZE, which should always be the case here too. Looking 
at how panfrost_mmu_map() is implemented, and given that we're already 
trying to handle stuff at 2MB granularity where possible, I don't see 
any obvious reasons for 64K pages not to work already (assuming there 
aren't any 4K assumptions baked into the userspace side). I might have 
to give it a try...

Robin.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  parent reply	other threads:[~2019-07-22  9:50 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-17 18:33 [PATCH 1/5] drm/panfrost: Restructure the GEM object creation Rob Herring
2019-07-17 18:33 ` [PATCH 2/5] drm/panfrost: Split panfrost_mmu_map SG list mapping to its own function Rob Herring
2019-07-18 15:03   ` Steven Price
2019-07-17 18:33 ` [PATCH 3/5] drm/panfrost: Add a no execute flag for BO allocations Rob Herring
     [not found]   ` <ecde43d2-45cc-d00a-9635-cb56a67263d4@arm.com>
2019-07-18 17:03     ` Rob Herring
2019-07-19 10:39       ` Steven Price
2019-07-19 22:07         ` Rob Herring
2019-07-22  9:45           ` Steven Price
2019-07-22  9:50           ` Robin Murphy [this message]
2019-07-22 10:07             ` Steven Price
2019-07-22 12:09               ` Robin Murphy
2019-07-22 12:19                 ` Steven Price
2019-07-22 13:25                   ` Robin Murphy
2019-07-22 16:18                     ` Rob Herring
2019-07-22 14:08   ` Alyssa Rosenzweig
2019-07-17 18:33 ` [PATCH 4/5] drm/panfrost: Add support for GPU heap allocations Rob Herring
2019-07-18 15:03   ` Steven Price
2019-07-19 14:27     ` Rob Herring
2019-07-19 14:45       ` Steven Price
2019-07-22 14:12         ` Alyssa Rosenzweig
2019-07-23  9:27           ` Tomeu Vizoso
2019-07-22 14:10     ` Alyssa Rosenzweig
     [not found]   ` <20190722141536.GF2156@rosenzweig.io>
2019-07-22 16:33     ` Rob Herring
2019-07-17 18:33 ` [PATCH 5/5] drm/panfrost: Bump driver version to 1.1 Rob Herring
     [not found] ` <9a01262c-eb29-5e48-cf94-4e9597ea414c@arm.com>
2019-07-19 22:22   ` [PATCH 1/5] drm/panfrost: Restructure the GEM object creation Rob Herring

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4b7fc0b4-aa5b-06ba-ad4a-5b959e265e67@arm.com \
    --to=robin.murphy@arm.com \
    --cc=alyssa@rosenzweig.io \
    --cc=boris.brezillon@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=robh@kernel.org \
    --cc=steven.price@arm.com \
    --cc=tomeu.vizoso@collabora.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.