* [PATCH] drm/nouveau: kill nouveau_ttm_fault_reserve_notify handler to prevent useless buffer moves
@ 2013-07-12 12:45 Maarten Lankhorst
       [not found] ` <51DFFA52.6010102-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Maarten Lankhorst @ 2013-07-12 12:45 UTC (permalink / raw)
  To: nouveau, dri-devel

I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index d506da5..86eb321 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1339,34 +1339,6 @@ nouveau_ttm_io_mem_free(struct ttm_bo_device *bdev, struct ttm_mem_reg *mem)
 }
 
 static int
-nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
-{
-	struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
-	struct nouveau_bo *nvbo = nouveau_bo(bo);
-	struct nouveau_device *device = nv_device(drm->device);
-	u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
-
-	/* as long as the bo isn't in vram, and isn't tiled, we've got
-	 * nothing to do here.
-	 */
-	if (bo->mem.mem_type != TTM_PL_VRAM) {
-		if (nv_device(drm->device)->card_type < NV_50 ||
-		    !nouveau_bo_tile_layout(nvbo))
-			return 0;
-	}
-
-	/* make sure bo is in mappable vram */
-	if (bo->mem.start + bo->mem.num_pages < mappable)
-		return 0;
-
-
-	nvbo->placement.fpfn = 0;
-	nvbo->placement.lpfn = mappable;
-	nouveau_bo_placement_set(nvbo, TTM_PL_FLAG_VRAM, 0);
-	return nouveau_bo_validate(nvbo, false, false);
-}
-
-static int
 nouveau_ttm_tt_populate(struct ttm_tt *ttm)
 {
 	struct ttm_dma_tt *ttm_dma = (void *)ttm;
@@ -1524,7 +1496,6 @@ struct ttm_bo_driver nouveau_bo_driver = {
 	.sync_obj_flush = nouveau_bo_fence_flush,
 	.sync_obj_unref = nouveau_bo_fence_unref,
 	.sync_obj_ref = nouveau_bo_fence_ref,
-	.fault_reserve_notify = &nouveau_ttm_fault_reserve_notify,
 	.io_mem_reserve = &nouveau_ttm_io_mem_reserve,
 	.io_mem_free = &nouveau_ttm_io_mem_free,
 };


* Re: [PATCH] drm/nouveau: kill nouveau_ttm_fault_reserve_notify handler to prevent useless buffer moves
       [not found] ` <51DFFA52.6010102-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
@ 2013-07-15  6:05   ` Ben Skeggs
       [not found]     ` <CACAvsv74X7xU0BkfhN0gXHvFG+Ooe=7wrFBzKnqUF7PMeB5wLw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Skeggs @ 2013-07-15  6:05 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: nouveau, dri-devel

On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.

What configuration does the buffer that's getting moved here have
exactly?  The placement restriction isn't necessary on GF8, the rest
of the restrictions may currently be required still however.

Ben.
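
A minimal sketch of the "mappable VRAM" test being discussed, assuming 4 KiB pages. The `sketch_*` names and `SKETCH_PAGE_SHIFT` are hypothetical stand-ins for illustration, not nouveau or TTM symbols. It only mirrors the arithmetic in the handler: the BAR length is converted to a page count, and a buffer counts as reachable when its last page falls below that count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, standing in for the kernel's PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12

/* Number of pages the CPU can reach through a BAR of bar_len bytes. */
static uint32_t sketch_mappable_pages(uint64_t bar_len)
{
	return (uint32_t)(bar_len >> SKETCH_PAGE_SHIFT);
}

/*
 * Mirrors the handler's comparison: start + num_pages < mappable.
 * The comparison is strict, as in the quoted code, so a buffer that
 * ends exactly at the window boundary is reported as not mappable.
 */
static bool sketch_bo_is_mappable(uint32_t start, uint32_t num_pages,
				  uint32_t mappable)
{
	return start + num_pages < mappable;
}
```

With a 256 MiB BAR the window is 65536 pages, so on a board with more VRAM than that, any buffer placed past the window fails the test and is the kind of buffer the handler revalidates into the window.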


>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index d506da5..86eb321 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1339,34 +1339,6 @@ nouveau_ttm_io_mem_free(struct ttm_bo_device *bdev, struct ttm_mem_reg *mem)
>  }
>
>  static int
> -nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
> -{
> -       struct nouveau_drm *drm = nouveau_bdev(bo->bdev);
> -       struct nouveau_bo *nvbo = nouveau_bo(bo);
> -       struct nouveau_device *device = nv_device(drm->device);
> -       u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
> -
> -       /* as long as the bo isn't in vram, and isn't tiled, we've got
> -        * nothing to do here.
> -        */
> -       if (bo->mem.mem_type != TTM_PL_VRAM) {
> -               if (nv_device(drm->device)->card_type < NV_50 ||
> -                   !nouveau_bo_tile_layout(nvbo))
> -                       return 0;
> -       }
> -
> -       /* make sure bo is in mappable vram */
> -       if (bo->mem.start + bo->mem.num_pages < mappable)
> -               return 0;
> -
> -
> -       nvbo->placement.fpfn = 0;
> -       nvbo->placement.lpfn = mappable;
> -       nouveau_bo_placement_set(nvbo, TTM_PL_FLAG_VRAM, 0);
> -       return nouveau_bo_validate(nvbo, false, false);
> -}
> -
> -static int
>  nouveau_ttm_tt_populate(struct ttm_tt *ttm)
>  {
>         struct ttm_dma_tt *ttm_dma = (void *)ttm;
> @@ -1524,7 +1496,6 @@ struct ttm_bo_driver nouveau_bo_driver = {
>         .sync_obj_flush = nouveau_bo_fence_flush,
>         .sync_obj_unref = nouveau_bo_fence_unref,
>         .sync_obj_ref = nouveau_bo_fence_ref,
> -       .fault_reserve_notify = &nouveau_ttm_fault_reserve_notify,
>         .io_mem_reserve = &nouveau_ttm_io_mem_reserve,
>         .io_mem_free = &nouveau_ttm_io_mem_free,
>  };
>


* [PATCH] drm/nouveau: do not move buffers when not needed
       [not found]     ` <CACAvsv74X7xU0BkfhN0gXHvFG+Ooe=7wrFBzKnqUF7PMeB5wLw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2013-07-15  8:39       ` Maarten Lankhorst
  2013-08-24  6:26         ` [Nouveau] " Martin Peres
       [not found]         ` <51E3B55D.8080403-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
  0 siblings, 2 replies; 7+ messages in thread
From: Maarten Lankhorst @ 2013-07-15  8:39 UTC (permalink / raw)
  To: Ben Skeggs
  Cc: nouveau, dri-devel

On 15-07-13 08:05, Ben Skeggs wrote:
> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
> <maarten.lankhorst@canonical.com> wrote:
>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.
>
> What configuration does the buffer that's getting moved here have
> exactly?  The placement restriction isn't necessary on GF8, the rest
> of the restrictions may currently be required still however.
>
vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because
the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.
8<---
This prevents buffer moves from being done on NV50+, where remapping is not needed because
the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible.
nouveau_bo_tile_layout is always 0 on < NV_50.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index d506da5..762bfcd 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
 	struct nouveau_device *device = nv_device(drm->device);
 	u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
 
-	/* as long as the bo isn't in vram, and isn't tiled, we've got
-	 * nothing to do here.
+	/*
+	 * if the bo is not in vram, or remapping can be done (nv50+)
+	 * do not worry about placement, any location is valid
 	 */
-	if (bo->mem.mem_type != TTM_PL_VRAM) {
-		if (nv_device(drm->device)->card_type < NV_50 ||
-		    !nouveau_bo_tile_layout(nvbo))
-			return 0;
-	}
+	if (nv_device(drm->device)->card_type >= NV_50 ||
+	    bo->mem.mem_type != TTM_PL_VRAM)
+		return 0;
 
 	/* make sure bo is in mappable vram */
 	if (bo->mem.start + bo->mem.num_pages < mappable)


* Re: [Nouveau] [PATCH] drm/nouveau: do not move buffers when not needed
  2013-07-15  8:39       ` [PATCH] drm/nouveau: do not move buffers when not needed Maarten Lankhorst
@ 2013-08-24  6:26         ` Martin Peres
       [not found]         ` <51E3B55D.8080403-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
  1 sibling, 0 replies; 7+ messages in thread
From: Martin Peres @ 2013-08-24  6:26 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: nouveau, dri-devel

On 15/07/2013 10:39, Maarten Lankhorst wrote:
> On 15-07-13 08:05, Ben Skeggs wrote:
>> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
>> <maarten.lankhorst@canonical.com> wrote:
>>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
>>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
>>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
>> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.
>>
>> What configuration does the buffer that's getting moved here have
>> exactly?  The placement restriction isn't necessary on GF8, the rest
>> of the restrictions may currently be required still however.
>>
> vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because
> the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.
> 8<---
> This prevents buffer moves from being done on NV50+, where remapping is not needed because
> the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible.
> nouveau_bo_tile_layout is always 0 on < NV_50.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
There are still some rendering issues on my nvc4, but the framerate is 
much smoother than it was before this patch.

Tested-by: Martin Peres <martin.peres@labri.fr>
> ---
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index d506da5..762bfcd 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
>   	struct nouveau_device *device = nv_device(drm->device);
>   	u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
>   
> -	/* as long as the bo isn't in vram, and isn't tiled, we've got
> -	 * nothing to do here.
> +	/*
> +	 * if the bo is not in vram, or remapping can be done (nv50+)
> +	 * do not worry about placement, any location is valid
>   	 */
> -	if (bo->mem.mem_type != TTM_PL_VRAM) {
> -		if (nv_device(drm->device)->card_type < NV_50 ||
> -		    !nouveau_bo_tile_layout(nvbo))
> -			return 0;
> -	}
> +	if (nv_device(drm->device)->card_type >= NV_50 ||
> +	    bo->mem.mem_type != TTM_PL_VRAM)
> +		return 0;
>   
>   	/* make sure bo is in mappable vram */
>   	if (bo->mem.start + bo->mem.num_pages < mappable)
>


* Re: [PATCH] drm/nouveau: do not move buffers when not needed
       [not found]         ` <51E3B55D.8080403-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
@ 2013-09-04  1:24           ` Ben Skeggs
  2013-09-04 13:25             ` Maarten Lankhorst
  0 siblings, 1 reply; 7+ messages in thread
From: Ben Skeggs @ 2013-09-04  1:24 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: nouveau, dri-devel

On Mon, Jul 15, 2013 at 6:39 PM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> On 15-07-13 08:05, Ben Skeggs wrote:
>> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
>> <maarten.lankhorst@canonical.com> wrote:
>>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
>>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
>>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
>> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.
>>
>> What configuration does the buffer that's getting moved here have
>> exactly?  The placement restriction isn't necessary on GF8, the rest
>> of the restrictions may currently be required still however.
>>
> vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because
> the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.
Why would you be doing such a thing in the first place?  It seems
pointless, and quite possibly counter-productive to use a tiled layout
for a linear data structure...

> 8<---
> This prevents buffer moves from being done on NV50+, where remapping is not needed because
> the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible.
> nouveau_bo_tile_layout is always 0 on < NV_50.
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
> index d506da5..762bfcd 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
> @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
>         struct nouveau_device *device = nv_device(drm->device);
>         u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
>
> -       /* as long as the bo isn't in vram, and isn't tiled, we've got
> -        * nothing to do here.
> +       /*
> +        * if the bo is not in vram, or remapping can be done (nv50+)
> +        * do not worry about placement, any location is valid
>          */
> -       if (bo->mem.mem_type != TTM_PL_VRAM) {
> -               if (nv_device(drm->device)->card_type < NV_50 ||
> -                   !nouveau_bo_tile_layout(nvbo))
> -                       return 0;
> -       }
> +       if (nv_device(drm->device)->card_type >= NV_50 ||
> +           bo->mem.mem_type != TTM_PL_VRAM)
> +               return 0;
I get what you're trying to do here, and we should definitely avoid
the "mappable vram" check on GF8, but I suspect this condition is too
broad.  I'll think about it more after I finish reviewing the rest of
the patches on the list..

Thanks,
Ben.
>
>         /* make sure bo is in mappable vram */
>         if (bo->mem.start + bo->mem.num_pages < mappable)
>


* Re: [PATCH] drm/nouveau: do not move buffers when not needed
  2013-09-04  1:24           ` Ben Skeggs
@ 2013-09-04 13:25             ` Maarten Lankhorst
  2013-09-10  8:14               ` Ben Skeggs
  0 siblings, 1 reply; 7+ messages in thread
From: Maarten Lankhorst @ 2013-09-04 13:25 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

On 04-09-13 03:24, Ben Skeggs wrote:
> On Mon, Jul 15, 2013 at 6:39 PM, Maarten Lankhorst
> <maarten.lankhorst@canonical.com> wrote:
>> On 15-07-13 08:05, Ben Skeggs wrote:
>>> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
>>> <maarten.lankhorst@canonical.com> wrote:
>>>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
>>>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
>>>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
>>> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.
>>>
>>> What configuration does the buffer that's getting moved here have
>>> exactly?  The placement restriction isn't necessary on GF8, the rest
>>> of the restrictions may currently be required still however.
>>>
>> vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because
>> the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.
> Why would you be doing such a thing in the first place?  It seems
> pointless, and quite possibly counter-productive to use a tiled layout
> for a linear data structure...
Initially I just allocated every buffer I didn't need to access directly as tiled, and it seems I did the same for
the bitstream bo. I only found out later that the excessive moves were causing a major slowdown.

>> 8<---
>> This prevents buffer moves from being done on NV50+, where remapping is not needed because
>> the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible.
>> nouveau_bo_tile_layout is always 0 on < NV_50.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> index d506da5..762bfcd 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
>> @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
>>         struct nouveau_device *device = nv_device(drm->device);
>>         u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
>>
>> -       /* as long as the bo isn't in vram, and isn't tiled, we've got
>> -        * nothing to do here.
>> +       /*
>> +        * if the bo is not in vram, or remapping can be done (nv50+)
>> +        * do not worry about placement, any location is valid
>>          */
>> -       if (bo->mem.mem_type != TTM_PL_VRAM) {
>> -               if (nv_device(drm->device)->card_type < NV_50 ||
>> -                   !nouveau_bo_tile_layout(nvbo))
>> -                       return 0;
>> -       }
>> +       if (nv_device(drm->device)->card_type >= NV_50 ||
>> +           bo->mem.mem_type != TTM_PL_VRAM)
>> +               return 0;
> I get what you're trying to do here, and we should definitely avoid
> the "mappable vram" check on GF8, but I suspect this condition is too
> broad.  I'll think about it more after I finish reviewing the rest of
> the patches on the list..
>
I think this relaxed check is fine. If it's !VRAM, the host can always access it because it has direct access to the
pages without needing anything from the gpu. On >= NV50 the move can always be skipped too because the
memory is mapped to the vm, and always accessible.

~Maarten
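
To make the comparison concrete, here is a rough model of the handler's decision before and after the patch. It is a sketch only: the enums and `sketch_*` functions are hypothetical stand-ins rather than driver code, and `fits_in_bar` stands for the result of the "make sure bo is in mappable vram" check. Each function returns true when the handler would go on to revalidate (and possibly move) the buffer.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's card and memory types. */
enum sketch_card { SKETCH_NV_40 = 0x40, SKETCH_NV_50 = 0x50, SKETCH_NV_C0 = 0xc0 };
enum sketch_mem { SKETCH_PL_SYSTEM, SKETCH_PL_TT, SKETCH_PL_VRAM };

/* Before the patch: only non-VRAM buffers that are untiled or on a
 * pre-NV50 card take the early return. */
static bool sketch_old_would_move(enum sketch_card card, enum sketch_mem mem,
				  bool tiled, bool fits_in_bar)
{
	if (mem != SKETCH_PL_VRAM) {
		if (card < SKETCH_NV_50 || !tiled)
			return false;
	}
	return !fits_in_bar;
}

/* After the patch: NV50+ and non-VRAM buffers always take the early
 * return, so only pre-NV50 VRAM buffers reach the window check. */
static bool sketch_new_would_move(enum sketch_card card, enum sketch_mem mem,
				  bool fits_in_bar)
{
	if (card >= SKETCH_NV_50 || mem != SKETCH_PL_VRAM)
		return false;
	return !fits_in_bar;
}
```

In this model a tiled VRAM buffer on NVC0 that lies outside the BAR1-size window is moved by the old logic and left alone by the new one, while a pre-NV50 VRAM buffer outside the window is moved in both.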


* Re: [PATCH] drm/nouveau: do not move buffers when not needed
  2013-09-04 13:25             ` Maarten Lankhorst
@ 2013-09-10  8:14               ` Ben Skeggs
  0 siblings, 0 replies; 7+ messages in thread
From: Ben Skeggs @ 2013-09-10  8:14 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: nouveau, dri-devel

On Wed, Sep 4, 2013 at 11:25 PM, Maarten Lankhorst
<maarten.lankhorst@canonical.com> wrote:
> On 04-09-13 03:24, Ben Skeggs wrote:
>> On Mon, Jul 15, 2013 at 6:39 PM, Maarten Lankhorst
>> <maarten.lankhorst@canonical.com> wrote:
>>> On 15-07-13 08:05, Ben Skeggs wrote:
>>>> On Fri, Jul 12, 2013 at 10:45 PM, Maarten Lankhorst
>>>> <maarten.lankhorst@canonical.com> wrote:
>>>>> I have no idea what this bogus restriction on placement is, but it breaks decoding 1080p
>>>>> VDPAU at boot speed. With this patch applied I only need to bump the vdec clock to
>>>>> get real-time 1080p decoding. It prevents a lot of VRAM <-> VRAM buffer moves.
>>>> It's not bogus, and is required for pre-GF8 boards with VRAM > BAR size.
>>>>
>>>> What configuration does the buffer that's getting moved here have
>>>> exactly?  The placement restriction isn't necessary on GF8, the rest
>>>> of the restrictions may currently be required still however.
>>>>
>>> vdpau on NVC0 with tiling. I upload the raw bitstream to a tiling bo. This is ok because
>>> the vm hides all the tiling translations, and the engines will read the raw bitstream correctly.
>> Why would you be doing such a thing in the first place?  It seems
>> pointless, and quite possibly counter-productive to use a tiled layout
>> for a linear data structure...
> Initially I just allocated every buffer I didn't need to access directly as tiled, and it seems I did the same for
> the bitstream bo. I only found out later that the excessive moves were causing a major slowdown.
>
>>> 8<---
>>> This prevents buffer moves from being done on NV50+, where remapping is not needed because
>>> the bar has its own VM, instead of only having the first BAR1-size chunk of VRAM accessible.
>>> nouveau_bo_tile_layout is always 0 on < NV_50.
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
>>> index d506da5..762bfcd 100644
>>> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c
>>> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
>>> @@ -1346,14 +1361,13 @@ nouveau_ttm_fault_reserve_notify(struct ttm_buffer_object *bo)
>>>         struct nouveau_device *device = nv_device(drm->device);
>>>         u32 mappable = pci_resource_len(device->pdev, 1) >> PAGE_SHIFT;
>>>
>>> -       /* as long as the bo isn't in vram, and isn't tiled, we've got
>>> -        * nothing to do here.
>>> +       /*
>>> +        * if the bo is not in vram, or remapping can be done (nv50+)
>>> +        * do not worry about placement, any location is valid
>>>          */
>>> -       if (bo->mem.mem_type != TTM_PL_VRAM) {
>>> -               if (nv_device(drm->device)->card_type < NV_50 ||
>>> -                   !nouveau_bo_tile_layout(nvbo))
>>> -                       return 0;
>>> -       }
>>> +       if (nv_device(drm->device)->card_type >= NV_50 ||
>>> +           bo->mem.mem_type != TTM_PL_VRAM)
>>> +               return 0;
>> I get what you're trying to do here, and we should definitely avoid
>> the "mappable vram" check on GF8, but I suspect this condition is too
>> broad.  I'll think about it more after I finish reviewing the rest of
>> the patches on the list..
>>
> I think this relaxed check is fine. If it's !VRAM, the host can always access it because it has direct access to the
> pages without needing anything from the gpu. On >= NV50 the move can always be skipped too because the
> memory is mapped to the vm, and always accessible.
Yeah, I think the check is sane, after thinking it through.  We shall
find out :)

But first, if you're not going to move tiled system memory to VRAM
first, you'll need to map it into BAR1 so that the VM handles some
of the reordering.  See what's done in io_mem_reserve() for
TTM_PL_VRAM.

Ben.

>
> ~Maarten
>
