From: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
To: Ben Skeggs <bskeggs-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	David Airlie <airlied-cv59FeDIM0c@public.gmane.org>,
	David Herrmann
	<dh.herrmann-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Lucas Stach <dev-8ppwABl0HbeELgA04lAiVw@public.gmane.org>,
	Thierry Reding
	<thierry.reding-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Maarten Lankhorst
	<maarten.lankhorst-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	linux-tegra-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: [PATCH v4 0/6] drm: nouveau: memory coherency on ARM
Date: Tue, 8 Jul 2014 17:25:55 +0900
Message-ID: <1404807961-30530-1-git-send-email-acourbot@nvidia.com>

Another revision of this patchset, which is critical for GK20A to operate.

Previous attempts were exclusively using either TTM's regular page allocator or 
the DMA API one. Both have their advantages and drawbacks: the page allocator is 
fast but requires explicit synchronization on non-coherent architectures, 
whereas the DMA allocator always returns coherent memory, but is also slower,
creates a permanent kernel mapping, and is more constrained as to which memory
it can use.

This version attempts to use the best-suited allocator for each buffer
use-case:
- buffers that are passed to user-space can explicitly be synced during their
  validation and preparation for CPU access, as previously shown by Lucas
  (http://lists.freedesktop.org/archives/nouveau/2013-August/014029.html). For
  these, we don't mind if the memory is not coherent and prefer to use the page
  allocator.
- buffers that are used by the kernel, typically fences and GPFIFO buffers, are
  accessed rarely and thus should not trigger a costly flush or cache
  invalidation. For these, we want to guarantee coherent access and use the DMA
  API if necessary.

This series attempts to implement this behavior by allowing the
TTM_PL_FLAG_UNCACHED flag to be passed to nouveau_bo_new(). On coherent
architectures this flag is a no-op; on non-coherent architectures, it will
force the creation of a coherent buffer using the DMA API.
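
As an illustration only (this is not code from the series), the decision
described above can be modelled in plain C. BO_FLAG_UNCACHED, enum bo_backing
and pick_backing() are hypothetical names standing in for
TTM_PL_FLAG_UNCACHED and the logic reached through nouveau_bo_new(); the real
flag value and code paths differ.

```c
#include <stdbool.h>

/* Hypothetical stand-in for TTM_PL_FLAG_UNCACHED; not the real value. */
#define BO_FLAG_UNCACHED (1u << 0)

/* Which allocator backs a buffer object in this simplified model. */
enum bo_backing {
	BO_BACKING_PAGE_ALLOC,		/* fast, may need explicit syncs */
	BO_BACKING_DMA_COHERENT		/* slower, always coherent */
};

/*
 * Decide how to back a BO. On a CPU-coherent architecture the flag changes
 * nothing; on a non-coherent one, a BO created with the flag is forced onto
 * the DMA allocator.
 */
enum bo_backing pick_backing(unsigned int flags, bool cpu_coherent)
{
	if (!cpu_coherent && (flags & BO_FLAG_UNCACHED))
		return BO_BACKING_DMA_COHERENT;

	return BO_BACKING_PAGE_ALLOC;
}
```

In this model, only the combination "non-coherent architecture plus flag set"
pays the cost of the DMA allocator; every other BO keeps using the page
allocator.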

Several fixes and changes were necessary to enable this behavior:
- CPU addresses of DMA-allocated BOs must be made visible (patch 1) so the
  coherent mapping can be used by drivers.
- The DMA-sync functions are required for BOs populated using the page allocator 
  (patch 4). Pages need to be mapped to the device using the correct API if we
  are to call the sync functions (patch 2). Additionally, we need to understand 
  whether we are on a CPU-coherent architecture (patch 3).
- Coherent BOs need to be detected by Nouveau so their coherent kernel mapping
  can be used instead of creating a new one (patch 5).
- Finally, buffers that are used by the kernel should be requested to be
  coherent (patch 6).
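
To make the interplay between these points concrete, here is a rough model of
when a BO needs explicit synchronization. It is a sketch under assumptions,
not the code of patch 4: struct bo_model, its fields and bo_pages_to_sync()
are hypothetical, and where the model only counts pages, a driver would call
the DMA API sync functions on each mapped page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a buffer object. */
struct bo_model {
	bool force_coherent;	/* BO was allocated through the DMA API */
	size_t nr_pages;	/* pages backing the BO */
};

/*
 * Return how many pages would need a cache sync before the device (or the
 * CPU) accesses the BO. Nothing is needed on a CPU-coherent architecture,
 * nor for a BO that already has a coherent mapping.
 */
size_t bo_pages_to_sync(const struct bo_model *bo, bool cpu_coherent)
{
	if (cpu_coherent || bo->force_coherent)
		return 0;

	/* Page-allocator BO on a non-coherent architecture: sync each page. */
	return bo->nr_pages;
}
```

The early return is the point of the model: the sync cost is only paid for
page-allocator BOs on non-coherent architectures, which is why the series
needs both the coherency query and a way to recognize coherent BOs.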

Changes since v3:
- Only use the DMA allocator for BOs that strictly need to be coherent
- Fixed the way pages are mapped to the GPU on platform devices
- Thoroughly checked with CONFIG_DMA_API_DEBUG that there were no API violations

Alexandre Courbot (6):
  drm/ttm: expose CPU address of DMA-allocated pages
  drm/nouveau: map pages using DMA API on platform devices
  drm/nouveau: introduce nv_device_is_cpu_coherent()
  drm/nouveau: synchronize BOs when required
  drm/nouveau: implement explicitly coherent BOs
  drm/nouveau: allocate GPFIFOs and fences coherently

 drivers/gpu/drm/nouveau/core/engine/device/base.c  |  14 ++-
 drivers/gpu/drm/nouveau/core/include/core/device.h |   3 +
 drivers/gpu/drm/nouveau/nouveau_bo.c               | 132 +++++++++++++++++++--
 drivers/gpu/drm/nouveau/nouveau_bo.h               |   3 +
 drivers/gpu/drm/nouveau/nouveau_chan.c             |   2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c              |  12 ++
 drivers/gpu/drm/nouveau/nv84_fence.c               |   4 +-
 drivers/gpu/drm/ttm/ttm_page_alloc_dma.c           |   2 +
 drivers/gpu/drm/ttm/ttm_tt.c                       |   6 +-
 include/drm/ttm/ttm_bo_driver.h                    |   2 +
 10 files changed, 167 insertions(+), 13 deletions(-)

-- 
2.0.0
