From: Rob Clark <robdclark@gmail.com>
To: Jonathan Marek <jonathan@marek.ca>
Cc: Christoph Hellwig <hch@lst.de>,
	freedreno <freedreno@lists.freedesktop.org>,
	Sean Paul <sean@poorly.run>, David Airlie <airlied@linux.ie>,
	Daniel Vetter <daniel@ffwll.ch>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU" 
	<linux-arm-msm@vger.kernel.org>,
	"open list:DRM DRIVER FOR MSM ADRENO GPU" 
	<dri-devel@lists.freedesktop.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [RESEND PATCH v2 4/5] drm/msm: add DRM_MSM_GEM_SYNC_CACHE for non-coherent cache maintenance
Date: Sat, 14 Nov 2020 12:48:15 -0800
Message-ID: <CAF6AEGv7fXGVVWcTcSXF6EGB2LOi_wvQP6h6hcX8yNvAZRDbVg@mail.gmail.com>
In-Reply-To: <b6e4f167-871a-5f26-46bd-d914476af519@marek.ca>

On Sat, Nov 14, 2020 at 12:10 PM Jonathan Marek <jonathan@marek.ca> wrote:
>
> On 11/14/20 2:39 PM, Rob Clark wrote:
> > On Sat, Nov 14, 2020 at 10:58 AM Jonathan Marek <jonathan@marek.ca> wrote:
> >>
> >> On 11/14/20 1:46 PM, Rob Clark wrote:
> >>> On Sat, Nov 14, 2020 at 8:24 AM Christoph Hellwig <hch@lst.de> wrote:
> >>>>
> >>>> On Sat, Nov 14, 2020 at 10:17:12AM -0500, Jonathan Marek wrote:
> >>>>> +void msm_gem_sync_cache(struct drm_gem_object *obj, uint32_t flags,
> >>>>> +             size_t range_start, size_t range_end)
> >>>>> +{
> >>>>> +     struct msm_gem_object *msm_obj = to_msm_bo(obj);
> >>>>> +     struct device *dev = msm_obj->base.dev->dev;
> >>>>> +
> >>>>> +     /* exit early if get_pages() hasn't been called yet */
> >>>>> +     if (!msm_obj->pages)
> >>>>> +             return;
> >>>>> +
> >>>>> +     /* TODO: sync only the specified range */
> >>>>> +
> >>>>> +     if (flags & MSM_GEM_SYNC_FOR_DEVICE) {
> >>>>> +             dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
> >>>>> +                             msm_obj->sgt->nents, DMA_TO_DEVICE);
> >>>>> +     }
> >>>>> +
> >>>>> +     if (flags & MSM_GEM_SYNC_FOR_CPU) {
> >>>>> +             dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
> >>>>> +                             msm_obj->sgt->nents, DMA_FROM_DEVICE);
> >>>>> +     }
> >>>>
> >>>> Splitting this helper from the only caller is rather strange, especially
> >>>> with the two unused arguments.  And I think the way this is specified
> >>>> to take a range, but ignoring it is actively dangerous.  User space will
> >>>> rely on it syncing everything sooner or later and then you are stuck.
> >>>> So just define a sync all primitive for now, and if you really need a
> >>>> range sync and have actually implemented it add a new ioctl for that.
> >>>
> >>> We do already have a split of ioctl "layer" which enforces valid ioctl
> >>> params, etc, and gem (or other) module code which is called by the
> >>> ioctl func.  So I think it is fine to keep this split here.  (Also, I
> >>> think at some point there will be a uring type of ioctl alternative
> >>> which would re-use the same gem func.)
> >>>
> >>> But I do agree that the range should be respected or added later..
> >>> drm_ioctl() dispatch is well prepared for extending ioctls.
> >>>
> >>> And I assume there should be some validation that the range is aligned
> >>> to cache-line?  Or can we flush a partial cache line?
> >>>
> >>
> >> The range is intended to be "sync at least this range", so that
> >> userspace doesn't have to worry about details like that.
> >>
> >
> > I don't think userspace can *not* worry about details like that.
> > Consider a case where the cpu and gpu are simultaneously accessing
> > different parts of a buffer (for ex, sub-allocation).  There needs to
> > be cache-line separation between the two.
> >
>
> Right.. and it also seems like we can't get away with just
> flushing/invalidating the whole thing.
>
> qcom's vulkan driver has nonCoherentAtomSize=1, and it looks like
> dma_sync_single_for_cpu() does deal in some way with the partial cache
> line case, although I'm not sure that means we can have a
> nonCoherentAtomSize=1.
>

flush/inv the whole thing could be a useful first step, or at least I
can think of some uses for it.  But if it isn't useful for how vk sees
the world, then maybe we should just implement the range properly from
the get-go.  (And I *think* we should require the range to be aligned
to cacheline boundaries.. it is always easier from a kernel uabi PoV to
loosen restrictions later than the other way around.)
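
To make the "require alignment" option concrete, here is a minimal
sketch of the kind of check an ioctl layer could apply before calling
into the gem code.  It is not taken from the patch: the helper name,
the HYP_CACHE_LINE constant and the fixed 64-byte line size are all
assumptions for illustration, and a real driver would have to obtain
the alignment from the platform rather than hard-code it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines.  Hypothetical constant, not from the patch. */
#define HYP_CACHE_LINE 64u

static_assert((HYP_CACHE_LINE & (HYP_CACHE_LINE - 1)) == 0,
	      "cache line size must be a power of two");

/*
 * Hypothetical helper: accept a byte range [start, end) only if it is
 * non-empty, lies inside the buffer, and both ends sit on a cache-line
 * boundary.  Anything else would be rejected by the ioctl layer.
 */
static bool hyp_range_is_valid(uint64_t start, uint64_t end, uint64_t bo_size)
{
	if (start >= end || end > bo_size)
		return false;

	return (start % HYP_CACHE_LINE) == 0 && (end % HYP_CACHE_LINE) == 0;
}
```

With a check like this, hyp_range_is_valid(0, 64, 4096) passes while
hyp_range_is_valid(0, 100, 4096) is refused.  Starting strict keeps the
option of accepting unaligned ranges later without breaking userspace
that already passes aligned ones.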

BR,
-R
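
The sub-allocation concern quoted above can be illustrated with a small
sketch.  It models "sync at least this range" as rounding the range
outward to whole cache lines and then checks whether the widened range
touches a neighbouring sub-allocation.  The struct, the helper names and
the 64-byte line size are assumptions for illustration only; nothing
here is from the patch or the DMA API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines.  Hypothetical constant. */
#define HYP_CACHE_LINE 64u

static_assert((HYP_CACHE_LINE & (HYP_CACHE_LINE - 1)) == 0,
	      "cache line size must be a power of two");

/* Half-open byte range [start, end) inside one buffer object. */
struct hyp_range {
	uint64_t start;
	uint64_t end;
};

/* Widen a range to whole cache lines, as cache maintenance effectively does. */
static struct hyp_range hyp_round_out(struct hyp_range r)
{
	const uint64_t mask = (uint64_t)HYP_CACHE_LINE - 1;
	struct hyp_range out;

	out.start = r.start & ~mask;
	out.end = (r.end + mask) & ~mask;
	return out;
}

/* True if two half-open ranges share at least one byte. */
static bool hyp_overlaps(struct hyp_range a, struct hyp_range b)
{
	return a.start < b.end && b.start < a.end;
}
```

For example, a CPU-owned range of bytes 0..100 and a GPU-owned range of
bytes 100..200 do not overlap as written, but the CPU range rounds out
to 0..128 and then covers part of the GPU's data.  Placing the GPU range
at 128..256 instead keeps the two on separate cache lines.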

