* [PATCH] gem: allow user-space to specify an object should be coherent
@ 2015-02-26  3:44 Alexandre Courbot
       [not found] ` <1424922292-20688-1-git-send-email-acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Alexandre Courbot @ 2015-02-26  3:44 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

User-space uses mappable BOs notably for fences, and expects that a
value updated by the GPU will be immediately visible through the
user-space mapping.

ARM has a property that may prevent this from happening though: memory
can be mapped multiple times only if the different mappings share the
same caching properties. However, all lowmem is already identity-mapped
into the kernel with caching enabled, so when user-space requests an
uncached mapping, it actually gets one with an "undefined caching
policy", which has the strange side-effects described in Freedesktop
bug 86690.

To prevent this from happening, allow user-space to explicitly specify
which objects should be coherent, and create such objects with the
TTM_PL_FLAG_UNCACHED flag. This will make TTM allocate memory using the
DMA API, which will fix the identity mapping and allow us to safely map
the objects to user-space uncached.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
Patches that take advantage of this in Mesa will follow shortly. I'd
like to make sure the new flag is ok first before also adding it to libdrm.
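
For reference, the domain-to-placement translation with this patch
applied can be sketched in user space as below. This is only an
illustration under stated assumptions: domain_to_placement() is a
hypothetical helper rather than a function in the driver, the
TTM_PL_FLAG_* values are stand-ins picked for the sketch and not copied
from the TTM headers, NOUVEAU_GEM_DOMAIN_CPU is assumed to be bit 0, and
the VRAM/GART branches are assumed to match the part of
nouveau_gem_new() that sits above the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Domain bits as in nouveau_drm.h; CPU = bit 0 is an assumption. */
#define NOUVEAU_GEM_DOMAIN_CPU       (1u << 0)
#define NOUVEAU_GEM_DOMAIN_VRAM      (1u << 1)
#define NOUVEAU_GEM_DOMAIN_GART      (1u << 2)
#define NOUVEAU_GEM_DOMAIN_MAPPABLE  (1u << 3)
#define NOUVEAU_GEM_DOMAIN_COHERENT  (1u << 4)

/* Stand-in placement bits: illustrative values only. */
#define TTM_PL_FLAG_SYSTEM   (1u << 0)
#define TTM_PL_FLAG_TT       (1u << 1)
#define TTM_PL_FLAG_VRAM     (1u << 2)
#define TTM_PL_FLAG_UNCACHED (1u << 17)

/* Hypothetical helper mirroring the flag selection in nouveau_gem_new(). */
uint32_t domain_to_placement(uint32_t domain)
{
	uint32_t flags = 0;

	if (domain & NOUVEAU_GEM_DOMAIN_VRAM)
		flags |= TTM_PL_FLAG_VRAM;
	if (domain & NOUVEAU_GEM_DOMAIN_GART)
		flags |= TTM_PL_FLAG_TT;

	/* Fall back to system memory when nothing else was requested. */
	if (!flags || (domain & NOUVEAU_GEM_DOMAIN_CPU))
		flags |= TTM_PL_FLAG_SYSTEM;

	/* New in this patch: a coherent object gets an uncached placement. */
	if (domain & NOUVEAU_GEM_DOMAIN_COHERENT)
		flags |= TTM_PL_FLAG_UNCACHED;

	/* At least one memory type is always selected by this point. */
	assert(flags & (TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_TT | TTM_PL_FLAG_VRAM));
	return flags;
}
```

The coherent bit only adds a caching attribute on top of whichever
memory type was chosen; it does not select a memory type by itself, so a
request with just the coherent bit set still ends up in system memory.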

 drm/nouveau/include/uapi/drm/nouveau_drm.h | 1 +
 drm/nouveau/nouveau_gem.c                  | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drm/nouveau/include/uapi/drm/nouveau_drm.h b/drm/nouveau/include/uapi/drm/nouveau_drm.h
index 0d7608dc1a34..5507eead5863 100644
--- a/drm/nouveau/include/uapi/drm/nouveau_drm.h
+++ b/drm/nouveau/include/uapi/drm/nouveau_drm.h
@@ -39,6 +39,7 @@
 #define NOUVEAU_GEM_DOMAIN_VRAM      (1 << 1)
 #define NOUVEAU_GEM_DOMAIN_GART      (1 << 2)
 #define NOUVEAU_GEM_DOMAIN_MAPPABLE  (1 << 3)
+#define NOUVEAU_GEM_DOMAIN_COHERENT  (1 << 4)
 
 #define NOUVEAU_GEM_TILE_COMP        0x00030000 /* nv50-only */
 #define NOUVEAU_GEM_TILE_LAYOUT_MASK 0x0000ff00
diff --git a/drm/nouveau/nouveau_gem.c b/drm/nouveau/nouveau_gem.c
index 7c077fced1d1..0e690bf19fc9 100644
--- a/drm/nouveau/nouveau_gem.c
+++ b/drm/nouveau/nouveau_gem.c
@@ -189,6 +189,9 @@ nouveau_gem_new(struct drm_device *dev, int size, int align, uint32_t domain,
 	if (!flags || domain & NOUVEAU_GEM_DOMAIN_CPU)
 		flags |= TTM_PL_FLAG_SYSTEM;
 
+	if (domain & NOUVEAU_GEM_DOMAIN_COHERENT)
+		flags |= TTM_PL_FLAG_UNCACHED;
+
 	ret = nouveau_bo_new(dev, size, align, flags, tile_mode,
 			     tile_flags, NULL, NULL, pnvbo);
 	if (ret)
-- 
2.3.0

_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau


* [PATCH] instmem/gk20a: use roundup() macro
       [not found] ` <1424922292-20688-1-git-send-email-acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2015-02-26  3:44   ` Alexandre Courbot
  2015-02-26  8:36   ` [Nouveau] [PATCH] gem: allow user-space to specify an object should be coherent Lucas Stach
  1 sibling, 0 replies; 3+ messages in thread
From: Alexandre Courbot @ 2015-02-26  3:44 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

Use the roundup() macro to make the code easier to read and fix a
warning when the driver is compiled for 64-bit architectures.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
---
Ben, this should probably be squashed into patch 6/6 of my "RAM device
removal & IOMMU support" series, since it is not merged yet.
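
As a sanity check that the two forms compute the same value, here is a
small user-space sketch. It assumes 4 KiB pages and uses simplified
stand-ins for PAGE_SIZE, PAGE_MASK and roundup() instead of the kernel
definitions; the function names are hypothetical. It only demonstrates
the arithmetic equivalence and says nothing about the compiler warning.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-ins for the kernel definitions, assuming 4 KiB pages. */
#define PAGE_SIZE 4096UL
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y))

/* Old form: add PAGE_SIZE - 1, then clear the low bits. */
unsigned long round_with_mask(uint32_t v)
{
	return (v + ~PAGE_MASK) & PAGE_MASK;
}

/* New form: round up to the next multiple of PAGE_SIZE. */
unsigned long round_with_roundup(uint32_t v)
{
	return roundup(v, PAGE_SIZE);
}

/* Round to page bounds, then clamp to at least one page. */
unsigned long page_round(uint32_t v)
{
	unsigned long r = round_with_roundup(v);

	/* The old and new expressions must agree for every input. */
	assert(r == round_with_mask(v));
	return r > PAGE_SIZE ? r : PAGE_SIZE;
}
```

The clamp to PAGE_SIZE is still needed with roundup(), since a
requested size or alignment of 0 rounds to 0 in both forms.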

 drm/nouveau/nvkm/subdev/instmem/gk20a.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drm/nouveau/nvkm/subdev/instmem/gk20a.c b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
index a31196b6da8f..fcba72eb74a3 100644
--- a/drm/nouveau/nvkm/subdev/instmem/gk20a.c
+++ b/drm/nouveau/nvkm/subdev/instmem/gk20a.c
@@ -335,8 +335,8 @@ gk20a_instobj_ctor(struct nvkm_object *parent, struct nvkm_object *engine,
 		 priv->domain ? "IOMMU" : "DMA", args->size, args->align);
 
 	/* Round size and align to page bounds */
-	size = max((args->size  + ~PAGE_MASK) & PAGE_MASK, (u32)PAGE_SIZE);
-	align = max((args->align + ~PAGE_MASK) & PAGE_MASK, (u32)PAGE_SIZE);
+	size = max(roundup(args->size, PAGE_SIZE), PAGE_SIZE);
+	align = max(roundup(args->align, PAGE_SIZE), PAGE_SIZE);
 
 	if (priv->domain)
 		ret = gk20a_instobj_ctor_iommu(parent, engine, oclass,
-- 
2.3.0


* Re: [Nouveau] [PATCH] gem: allow user-space to specify an object should be coherent
       [not found] ` <1424922292-20688-1-git-send-email-acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
  2015-02-26  3:44   ` [PATCH] instmem/gk20a: use roundup() macro Alexandre Courbot
@ 2015-02-26  8:36   ` Lucas Stach
  1 sibling, 0 replies; 3+ messages in thread
From: Lucas Stach @ 2015-02-26  8:36 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Ben Skeggs, linux-tegra-u79uwXL29TY76Z2rM5mHXA,
	nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

On Thursday, 26.02.2015 at 12:44 +0900, Alexandre Courbot wrote:
> User-space uses mappable BOs notably for fences, and expects that a
> value updated by the GPU will be immediately visible through the
> user-space mapping.
> 
> ARM has a property that may prevent this from happening though: memory
> can be mapped multiple times only if the different mappings share the
> same caching properties. However, all lowmem is already identity-mapped
> into the kernel with caching enabled, so when user-space requests an
> uncached mapping, it actually gets one with an "undefined caching
> policy", which has the strange side-effects described in Freedesktop
> bug 86690.
> 
> To prevent this from happening, allow user-space to explicitly specify
> which objects should be coherent, and create such objects with the
> TTM_PL_FLAG_UNCACHED flag. This will make TTM allocate memory using the
> DMA API, which will fix the identity mapping and allow us to safely map
> the objects to user-space uncached.
> 
> Signed-off-by: Alexandre Courbot <acourbot-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>

OK, this is only needed because userspace skips the cpu_prep for the
fence BO reads. Since adding the cpu_prep would increase the userspace
fence overhead a lot, this flag seems to be the right thing to do.

Reviewed-by: Lucas Stach <dev-8ppwABl0HbeELgA04lAiVw@public.gmane.org>
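
To make the trade-off concrete, a fence check that skips cpu_prep
amounts to a plain read through the CPU mapping of the fence BO, roughly
as sketched below. The function name, the 32-bit sequence number and
the wraparound comparison are assumptions made for illustration, not
taken from Mesa or libdrm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space helper: test a fence by reading the sequence
 * number straight from the CPU mapping of the fence BO, with no
 * cpu_prep call beforehand. This is only reliable if GPU writes become
 * visible through that mapping, which is what the coherent flag is
 * meant to ensure.
 */
bool fence_signalled(const volatile uint32_t *fence_map, uint32_t wanted_seq)
{
	uint32_t current;

	assert(fence_map != NULL);

	/* volatile forces a fresh load from the mapping on every call. */
	current = *fence_map;

	/* The signed difference keeps the test valid across wraparound. */
	return (int32_t)(current - wanted_seq) >= 0;
}
```

The read itself costs no system call, which is why adding a cpu_prep in
front of every such check would be comparatively expensive.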

> ---
> Patches that take advantage of this in Mesa will follow shortly. I'd
> like to make sure the new flag is ok first before also adding it to libdrm.
> 
>  drm/nouveau/include/uapi/drm/nouveau_drm.h | 1 +
>  drm/nouveau/nouveau_gem.c                  | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drm/nouveau/include/uapi/drm/nouveau_drm.h b/drm/nouveau/include/uapi/drm/nouveau_drm.h
> index 0d7608dc1a34..5507eead5863 100644
> --- a/drm/nouveau/include/uapi/drm/nouveau_drm.h
> +++ b/drm/nouveau/include/uapi/drm/nouveau_drm.h
> @@ -39,6 +39,7 @@
>  #define NOUVEAU_GEM_DOMAIN_VRAM      (1 << 1)
>  #define NOUVEAU_GEM_DOMAIN_GART      (1 << 2)
>  #define NOUVEAU_GEM_DOMAIN_MAPPABLE  (1 << 3)
> +#define NOUVEAU_GEM_DOMAIN_COHERENT  (1 << 4)
>  
>  #define NOUVEAU_GEM_TILE_COMP        0x00030000 /* nv50-only */
>  #define NOUVEAU_GEM_TILE_LAYOUT_MASK 0x0000ff00
> diff --git a/drm/nouveau/nouveau_gem.c b/drm/nouveau/nouveau_gem.c
> index 7c077fced1d1..0e690bf19fc9 100644
> --- a/drm/nouveau/nouveau_gem.c
> +++ b/drm/nouveau/nouveau_gem.c
> @@ -189,6 +189,9 @@ nouveau_gem_new(struct drm_device *dev, int size, int align, uint32_t domain,
>  	if (!flags || domain & NOUVEAU_GEM_DOMAIN_CPU)
>  		flags |= TTM_PL_FLAG_SYSTEM;
>  
> +	if (domain & NOUVEAU_GEM_DOMAIN_COHERENT)
> +		flags |= TTM_PL_FLAG_UNCACHED;
> +
>  	ret = nouveau_bo_new(dev, size, align, flags, tile_mode,
>  			     tile_flags, NULL, NULL, pnvbo);
>  	if (ret)
