All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
@ 2009-02-12 10:15 David Miller
  2009-02-12 10:35 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2009-02-12 10:15 UTC (permalink / raw)
  To: airlied; +Cc: benh, dri-devel, linux-kernel


The PCI GART table initialization code treats the GART table mapping
unconditionally as a kernel virtual address.

But it could be in the framebuffer, for example, and thus we're
dealing with a PCI MEM space ioremap() cookie.  Treating that as a
virtual address is illegal and will crash some system types (such as
sparc64 where the ioremap() return value is actually a physical I/O
address).

So access the area correctly, using gart_info->gart_table_location as
our guide.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/gpu/drm/ati_pcigart.c |   26 ++++++++++++++++++++------
 1 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
index c533d0c..2cd827a 100644
--- a/drivers/gpu/drm/ati_pcigart.c
+++ b/drivers/gpu/drm/ati_pcigart.c
@@ -95,10 +95,11 @@ EXPORT_SYMBOL(drm_ati_pcigart_cleanup);
 
 int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *gart_info)
 {
+	struct drm_local_map *map = &gart_info->mapping;
 	struct drm_sg_mem *entry = dev->sg;
 	void *address = NULL;
 	unsigned long pages;
-	u32 *pci_gart, page_base;
+	u32 *pci_gart, page_base, gart_idx;
 	dma_addr_t bus_address = 0;
 	int i, j, ret = 0;
 	int max_pages;
@@ -133,8 +134,14 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 	pages = (entry->pages <= max_pages)
 	    ? entry->pages : max_pages;
 
-	memset(pci_gart, 0, max_pages * sizeof(u32));
+	if (gart_info->gart_table_location == DRM_ATI_GART_MAIN) {
+		memset(pci_gart, 0, max_pages * sizeof(u32));
+	} else {
+		for (gart_idx = 0; gart_idx < max_pages; gart_idx++)
+			DRM_WRITE32(map, gart_idx * sizeof(u32), 0);
+	}
 
+	gart_idx = 0;
 	for (i = 0; i < pages; i++) {
 		/* we need to support large memory configurations */
 		entry->busaddr[i] = pci_map_page(dev->pdev, entry->pagelist[i],
@@ -149,19 +156,26 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
 		page_base = (u32) entry->busaddr[i];
 
 		for (j = 0; j < (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); j++) {
+			u32 val;
+
 			switch(gart_info->gart_reg_if) {
 			case DRM_ATI_GART_IGP:
-				*pci_gart = cpu_to_le32((page_base) | 0xc);
+				val = page_base | 0xc;
 				break;
 			case DRM_ATI_GART_PCIE:
-				*pci_gart = cpu_to_le32((page_base >> 8) | 0xc);
+				val = (page_base >> 8) | 0xc;
 				break;
 			default:
 			case DRM_ATI_GART_PCI:
-				*pci_gart = cpu_to_le32(page_base);
+				val = page_base;
 				break;
 			}
-			pci_gart++;
+			if (gart_info->gart_table_location ==
+			    DRM_ATI_GART_MAIN)
+				pci_gart[gart_idx] = cpu_to_le32(val);
+			else
+				DRM_WRITE32(map, gart_idx * sizeof(u32), val);
+			gart_idx++;
 			page_base += ATI_PCIGART_PAGE_SIZE;
 		}
 	}
-- 
1.6.1.2.350.g88cc


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 10:15 [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs David Miller
@ 2009-02-12 10:35 ` Benjamin Herrenschmidt
  2009-02-12 11:09   ` David Miller
  2009-02-14  6:09   ` David Miller
  0 siblings, 2 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2009-02-12 10:35 UTC (permalink / raw)
  To: David Miller; +Cc: airlied, dri-devel, linux-kernel

On Thu, 2009-02-12 at 02:15 -0800, David Miller wrote:
> The PCI GART table initialization code treats the GART table mapping
> unconditionally as a kernel virtual address.
> 
> But it could be in the framebuffer, for example, and thus we're
> dealing with a PCI MEM space ioremap() cookie.  Treating that as a
> virtual address is illegal and will crash some system types (such as
> sparc64 where the ioremap() return value is actually a physical I/O
> address).
> 
> So access the area correctly, using gart_info->gart_table_location as
> our guide.

Oh BTW something else to be careful with, though I suppose it's working
some what by accident right now... when the GART is in the frame buffer
it gets applied the current fb swapper setting... ouch !

So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
afaik are readl/writel (ie, swapping) to make sure we at least
temporarily disable that swapper while whacking the GART.

I know David (Airlied) kms stuff sorts that out by having a fixed
surface covering that area but definitely worth keeping in mind.

Cheers,
Ben.

> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  drivers/gpu/drm/ati_pcigart.c |   26 ++++++++++++++++++++------
>  1 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ati_pcigart.c b/drivers/gpu/drm/ati_pcigart.c
> index c533d0c..2cd827a 100644
> --- a/drivers/gpu/drm/ati_pcigart.c
> +++ b/drivers/gpu/drm/ati_pcigart.c
> @@ -95,10 +95,11 @@ EXPORT_SYMBOL(drm_ati_pcigart_cleanup);
>  
>  int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *gart_info)
>  {
> +	struct drm_local_map *map = &gart_info->mapping;
>  	struct drm_sg_mem *entry = dev->sg;
>  	void *address = NULL;
>  	unsigned long pages;
> -	u32 *pci_gart, page_base;
> +	u32 *pci_gart, page_base, gart_idx;
>  	dma_addr_t bus_address = 0;
>  	int i, j, ret = 0;
>  	int max_pages;
> @@ -133,8 +134,14 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
>  	pages = (entry->pages <= max_pages)
>  	    ? entry->pages : max_pages;
>  
> -	memset(pci_gart, 0, max_pages * sizeof(u32));
> +	if (gart_info->gart_table_location == DRM_ATI_GART_MAIN) {
> +		memset(pci_gart, 0, max_pages * sizeof(u32));
> +	} else {
> +		for (gart_idx = 0; gart_idx < max_pages; gart_idx++)
> +			DRM_WRITE32(map, gart_idx * sizeof(u32), 0);
> +	}
>  
> +	gart_idx = 0;
>  	for (i = 0; i < pages; i++) {
>  		/* we need to support large memory configurations */
>  		entry->busaddr[i] = pci_map_page(dev->pdev, entry->pagelist[i],
> @@ -149,19 +156,26 @@ int drm_ati_pcigart_init(struct drm_device *dev, struct drm_ati_pcigart_info *ga
>  		page_base = (u32) entry->busaddr[i];
>  
>  		for (j = 0; j < (PAGE_SIZE / ATI_PCIGART_PAGE_SIZE); j++) {
> +			u32 val;
> +
>  			switch(gart_info->gart_reg_if) {
>  			case DRM_ATI_GART_IGP:
> -				*pci_gart = cpu_to_le32((page_base) | 0xc);
> +				val = page_base | 0xc;
>  				break;
>  			case DRM_ATI_GART_PCIE:
> -				*pci_gart = cpu_to_le32((page_base >> 8) | 0xc);
> +				val = (page_base >> 8) | 0xc;
>  				break;
>  			default:
>  			case DRM_ATI_GART_PCI:
> -				*pci_gart = cpu_to_le32(page_base);
> +				val = page_base;
>  				break;
>  			}
> -			pci_gart++;
> +			if (gart_info->gart_table_location ==
> +			    DRM_ATI_GART_MAIN)
> +				pci_gart[gart_idx] = cpu_to_le32(val);
> +			else
> +				DRM_WRITE32(map, gart_idx * sizeof(u32), val);
> +			gart_idx++;
>  			page_base += ATI_PCIGART_PAGE_SIZE;
>  		}
>  	}


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 10:35 ` Benjamin Herrenschmidt
@ 2009-02-12 11:09   ` David Miller
  2009-02-12 11:23     ` Dave Airlie
  2009-02-12 21:26     ` Benjamin Herrenschmidt
  2009-02-14  6:09   ` David Miller
  1 sibling, 2 replies; 14+ messages in thread
From: David Miller @ 2009-02-12 11:09 UTC (permalink / raw)
  To: benh; +Cc: airlied, dri-devel, linux-kernel

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Thu, 12 Feb 2009 21:35:59 +1100

> Oh BTW something else to be careful with, though I suppose it's working
> some what by accident right now... when the GART is in the frame buffer
> it gets applied the current fb swapper setting... ouch !
> 
> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
> afaik are readl/writel (ie, swapping) to make sure we at least
> temporarily disable that swapper while whacking the GART.

Cute.

I wonder if this is what is tripping me up.

But, looking more closely, it appears that:

1) The kernel radeon framebuffer driver doesn't mess with
   the framebuffer endianness setting.

2) On >= R300 (which my chip is), Xorg leaves it alone too.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using  pointer derefs.
  2009-02-12 11:09   ` David Miller
@ 2009-02-12 11:23     ` Dave Airlie
  2009-02-12 22:17       ` David Miller
  2009-02-12 21:26     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 14+ messages in thread
From: Dave Airlie @ 2009-02-12 11:23 UTC (permalink / raw)
  To: David Miller; +Cc: benh, airlied, dri-devel, linux-kernel

On Thu, Feb 12, 2009 at 9:09 PM, David Miller <davem@davemloft.net> wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Thu, 12 Feb 2009 21:35:59 +1100
>
>> Oh BTW something else to be careful with, though I suppose it's working
>> some what by accident right now... when the GART is in the frame buffer
>> it gets applied the current fb swapper setting... ouch !
>>
>> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
>> afaik are readl/writel (ie, swapping) to make sure we at least
>> temporarily disable that swapper while whacking the GART.
>
> Cute.

Dave,

are you on a PCI or PCIE card, I've no idea what buses you have on sparc64.

On the PCI cards the GART table will always be in main memory.
PCIE always in VRAM.

Dave.

>
> I wonder if this is what is tripping me up.
>
> But, looking more closely, it appears that:
>
> 1) The kernel radeon framebuffer driver doesn't mess with
>   the framebuffer endianness setting.
>
> 2) On >= R300 (which my chip is), Xorg leaves it alone too.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 11:09   ` David Miller
  2009-02-12 11:23     ` Dave Airlie
@ 2009-02-12 21:26     ` Benjamin Herrenschmidt
  2009-02-12 21:29       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2009-02-12 21:26 UTC (permalink / raw)
  To: David Miller; +Cc: airlied, dri-devel, linux-kernel


> 1) The kernel radeon framebuffer driver doesn't mess with
>    the framebuffer endianness setting.
> 
> 2) On >= R300 (which my chip is), Xorg leaves it alone too.

They leave alone the swapper of the engine, not the fb one
(SURFACE_CNTL) afaik.

I dbl checked the other day and it seems that we setup the GART before
we whack it tho but it might be an issue in conjunction with radeonfb
(ie, I don't have radeonfb on my test embedded boards as I rely on X to
POST the cards using ATOMBIOS).

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 21:26     ` Benjamin Herrenschmidt
@ 2009-02-12 21:29       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2009-02-12 21:29 UTC (permalink / raw)
  To: David Miller; +Cc: airlied, dri-devel, linux-kernel

On Fri, 2009-02-13 at 08:26 +1100, Benjamin Herrenschmidt wrote:
> > 1) The kernel radeon framebuffer driver doesn't mess with
> >    the framebuffer endianness setting.
> > 
> > 2) On >= R300 (which my chip is), Xorg leaves it alone too.
> 
> They leave alone the swapper of the engine, not the fb one
> (SURFACE_CNTL) afaik.

Though actually if you stick to 8bpp, the default, the swapper will be
off in radeonfb.

> I dbl checked the other day and it seems that we setup the GART before
> we whack it tho but it might be an issue in conjunction with radeonfb
> (ie, I don't have radeonfb on my test embedded boards as I rely on X to
> POST the cards using ATOMBIOS).
> 
> Cheers,
> Ben.
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 11:23     ` Dave Airlie
@ 2009-02-12 22:17       ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2009-02-12 22:17 UTC (permalink / raw)
  To: airlied; +Cc: benh, airlied, dri-devel, linux-kernel

From: Dave Airlie <airlied@gmail.com>
Date: Thu, 12 Feb 2009 21:23:13 +1000

> are you on a PCI or PCIE card, I've no idea what buses you have on sparc64.
> 
> On the PCI cards the GART table will always be in main memory.
> PCIE always in VRAM.

PCI-E

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-12 10:35 ` Benjamin Herrenschmidt
  2009-02-12 11:09   ` David Miller
@ 2009-02-14  6:09   ` David Miller
  2009-02-14  7:42     ` Dave Airlie
  2009-02-14  9:07     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 14+ messages in thread
From: David Miller @ 2009-02-14  6:09 UTC (permalink / raw)
  To: benh; +Cc: airlied, dri-devel, linux-kernel

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Thu, 12 Feb 2009 21:35:59 +1100

> Oh BTW something else to be careful with, though I suppose it's working
> some what by accident right now... when the GART is in the frame buffer
> it gets applied the current fb swapper setting... ouch !
> 
> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
> afaik are readl/writel (ie, swapping) to make sure we at least
> temporarily disable that swapper while whacking the GART.

So I've narrowed down the final problems I'm having.  It's the surface
swapping settings indeed.

Running Xorg at depth 8, the CP can execute indirect buffers just
fine, no code changes necessary.

However, the CP hangs on indirect execution for depth 16 and 24.  But
these depths work if I hard disable the surface byte swapping settings
in the radeon Xorg driver.

I did some research, and it does appear that the GART does read the
PTEs from the VRAM using the Host Data Path.  This means the surface
control byte swapping settings are applied.

So for depths of 16 and 24, the GART is reading garbage PTEs.  And
that's why the CP hangs.

I have no idea how to handle this properly.  Not only does the Xorg
ATI driver set the swapping settings at an arbitray point, it also
mucks with them dynamically f.e. in the render helper functions.

The only way this can work properly is to choose one surface swapping
setting, set the VRAM GART table so that the GART sees the PTEs in the
correct format, and retain those settings throughout.

It seems like, in at least R5xx, they tried to add a control bit in
the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
register 0x10 is named 'GART_RDREQPATH_SEL' and has description:

	Read request path
	 0=HDP
	 1=Direct Rd Req to MC

This seems to be intended to be a way to have the GART PTE reads
bypass the full Host Data Path (and thus potential surface byte
swaps), by setting this bit to 1.

I tried doing that, but it doesn't help my RV370 so perhaps older
chips don't support this bit.  It's hard to tell as this is the area
where all of AMD's published GPU documents are severely lacking.

I tested whether reading back the PCIE_TX_GART_CNTL register shows
bit 6 after I try to write it, and it always reads back zero.

There are even more complications, the VT enter/exit code in the Xorg
ATI driver tries to save and restore the VRAM GART table for PCI-E
cards.  But this:

1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
   of the constant 4096 to determine how many GART entries there
   are.  The PCI GART entries map 4096 bytes, always.  So using
   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)

   I have this fixed in my local tree.

2) It doesn't check the surface byte swapping settings, so it
   could be saving in one byte order and restoing in another.

   I guess we could force RADEON_SURFACE_CNTL to zero around
   the two memcpy()'s done in radeon_driver.c

But really this whole area is a mess, and I know KMS is coming to fix
this, but even in the traditional XORG/DRM layout XORG has no business
keeping the GART table uptodate across VT switches.  It should be
calling into the DRM with an ioctl to write the GART table out to VRAM
again.

Finally there is a huge issue with how the Xorg ATI driver resets the
chip using the RBBM.  It soft resets the CP, but this resets the ring
read pointer.  However, nothing is done to make sure the DRM resync's
the ring write pointer (which remains unchanged by a soft CP reset) as
well as it's internal software ring state.

The result is that on the very next CP command submission, the CP
re-executes everything from ring entry zero until wherever the DRM
things the write pointer was at the time of the CP soft reset.

Any time the Xorg ATI driver does a CP soft reset, it should do
CP stop and resume calls to the DRM to resync the ring pointers.
And this does not appear to be happening where it needs to be
happening.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using  pointer derefs.
  2009-02-14  6:09   ` David Miller
@ 2009-02-14  7:42     ` Dave Airlie
  2009-02-14  8:58       ` David Miller
  2009-02-14  9:09       ` Benjamin Herrenschmidt
  2009-02-14  9:07     ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 14+ messages in thread
From: Dave Airlie @ 2009-02-14  7:42 UTC (permalink / raw)
  To: David Miller; +Cc: benh, airlied, dri-devel, linux-kernel

On Sat, Feb 14, 2009 at 4:09 PM, David Miller <davem@davemloft.net> wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Thu, 12 Feb 2009 21:35:59 +1100
>
>> Oh BTW something else to be careful with, though I suppose it's working
>> some what by accident right now... when the GART is in the frame buffer
>> it gets applied the current fb swapper setting... ouch !
>>
>> So it might be a good idea, if we're going to use DRM_READ/WRITE32 which
>> afaik are readl/writel (ie, swapping) to make sure we at least
>> temporarily disable that swapper while whacking the GART.
>
> So I've narrowed down the final problems I'm having.  It's the surface
> swapping settings indeed.
>
> Running Xorg at depth 8, the CP can execute indirect buffers just
> fine, no code changes necessary.
>
> However, the CP hangs on indirect execution for depth 16 and 24.  But
> these depths work if I hard disable the surface byte swapping settings
> in the radeon Xorg driver.
>
> I did some research, and it does appear that the GART does read the
> PTEs from the VRAM using the Host Data Path.  This means the surface
> control byte swapping settings are applied.
>
> So for depths of 16 and 24, the GART is reading garbage PTEs.  And
> that's why the CP hangs.

Wow that is ugly, truly ugly. I'm going to say I've little idea how to
fix this outside KMS,

The only think I can think to do is use a surface instead of the
aperture swappers
and make the surface cover all the VRAM except the GART table which is
at the end.

> I have no idea how to handle this properly.  Not only does the Xorg
> ATI driver set the swapping settings at an arbitray point, it also
> mucks with them dynamically f.e. in the render helper functions.
>
> The only way this can work properly is to choose one surface swapping
> setting, set the VRAM GART table so that the GART sees the PTEs in the
> correct format, and retain those settings throughout
>
> It seems like, in at least R5xx, they tried to add a control bit in
> the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
> register 0x10 is named 'GART_RDREQPATH_SEL' and has description:
>
>        Read request path
>         0=HDP
>         1=Direct Rd Req to MC
>
> This seems to be intended to be a way to have the GART PTE reads
> bypass the full Host Data Path (and thus potential surface byte
> swaps), by setting this bit to 1.
>
> I tried doing that, but it doesn't help my RV370 so perhaps older
> chips don't support this bit.  It's hard to tell as this is the area
> where all of AMD's published GPU documents are severely lacking.

You don't get this bit until rv515 and above so no pony for you.

>
> I tested whether reading back the PCIE_TX_GART_CNTL register shows
> bit 6 after I try to write it, and it always reads back zero.
>
> There are even more complications, the VT enter/exit code in the Xorg
> ATI driver tries to save and restore the VRAM GART table for PCI-E
> cards.  But this:
>
> 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
>   of the constant 4096 to determine how many GART entries there
>   are.  The PCI GART entries map 4096 bytes, always.  So using
>   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)
>
>   I have this fixed in my local tree.

Oops.

>
> 2) It doesn't check the surface byte swapping settings, so it
>   could be saving in one byte order and restoing in another.
>
>   I guess we could force RADEON_SURFACE_CNTL to zero around
>   the two memcpy()'s done in radeon_driver.c

Might be a good plan.

>
> But really this whole area is a mess, and I know KMS is coming to fix
> this, but even in the traditional XORG/DRM layout XORG has no business
> keeping the GART table uptodate across VT switches.  It should be
> calling into the DRM with an ioctl to write the GART table out to VRAM
> again.

You can draw an arbitrary line anywhere you want really, its all an unholy mess,
ever since people decided userspace drivers were a good idea.

>
> Finally there is a huge issue with how the Xorg ATI driver resets the
> chip using the RBBM.  It soft resets the CP, but this resets the ring
> read pointer.  However, nothing is done to make sure the DRM resync's
> the ring write pointer (which remains unchanged by a soft CP reset) as
> well as it's internal software ring state.
>
> The result is that on the very next CP command submission, the CP
> re-executes everything from ring entry zero until wherever the DRM
> things the write pointer was at the time of the CP soft reset.
>
> Any time the Xorg ATI driver does a CP soft reset, it should do
> CP stop and resume calls to the DRM to resync the ring pointers.
> And this does not appear to be happening where it needs to be
> happening.

That's interesting, it could explain why things never work again after a reset
or at least proceed to hang straight away.

Dave.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-14  7:42     ` Dave Airlie
@ 2009-02-14  8:58       ` David Miller
  2009-02-14  9:09       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 14+ messages in thread
From: David Miller @ 2009-02-14  8:58 UTC (permalink / raw)
  To: airlied; +Cc: benh, airlied, dri-devel, linux-kernel

From: Dave Airlie <airlied@gmail.com>
Date: Sat, 14 Feb 2009 17:42:02 +1000

> On Sat, Feb 14, 2009 at 4:09 PM, David Miller <davem@davemloft.net> wrote:
> > 1) Mis-sizes the GART table save buffer, it uses PAGE_SIZE instead
> >   of the constant 4096 to determine how many GART entries there
> >   are.  The PCI GART entries map 4096 bytes, always.  So using
> >   getpagesize() is wrong.  (see RADEONDRIGetPciAperTableSize)
> >
> >   I have this fixed in my local tree.
> 
> Oops.
> 
> > 2) It doesn't check the surface byte swapping settings, so it
> >   could be saving in one byte order and restoing in another.
> >
> >   I guess we could force RADEON_SURFACE_CNTL to zero around
> >   the two memcpy()'s done in radeon_driver.c
> 
> Might be a good plan.

I have patches for both of these things written, will submit
to the xorg-driver-ati list.

I also have a cunning plan to work around the surface swapping
GART issue in the DRM, will try that out right now.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-14  6:09   ` David Miller
  2009-02-14  7:42     ` Dave Airlie
@ 2009-02-14  9:07     ` Benjamin Herrenschmidt
  2009-02-14  9:11       ` David Miller
  1 sibling, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2009-02-14  9:07 UTC (permalink / raw)
  To: David Miller; +Cc: airlied, dri-devel, linux-kernel


> So I've narrowed down the final problems I'm having.  It's the surface
> swapping settings indeed.
> 
> Running Xorg at depth 8, the CP can execute indirect buffers just
> fine, no code changes necessary.
> 
> However, the CP hangs on indirect execution for depth 16 and 24.  But
> these depths work if I hard disable the surface byte swapping settings
> in the radeon Xorg driver.
> 
> I did some research, and it does appear that the GART does read the
> PTEs from the VRAM using the Host Data Path.  This means the surface
> control byte swapping settings are applied.
> 
> So for depths of 16 and 24, the GART is reading garbage PTEs.  And
> that's why the CP hangs.

That makes me wonder how the heck did it work for me ! Or maybe... I've
been using an R5xx which happens to have a bit that I haven't seen on
R3xx that allows ... to set whether the GART reads come from HDP or
directly from MC. That might be what saved my ass here.

> I have no idea how to handle this properly.  Not only does the Xorg
> ATI driver set the swapping settings at an arbitray point, it also
> mucks with them dynamically f.e. in the render helper functions.
> 
> The only way this can work properly is to choose one surface swapping
> setting, set the VRAM GART table so that the GART sees the PTEs in the
> correct format, and retain those settings throughout.

We can do that by registering a surface from the kernel to cover the
GART I suppose, and clean things a bit so that when using the DRI, X
doesn't touch the surface registers -at all- and leaves it to the
kernel.

> It seems like, in at least R5xx, they tried to add a control bit in
> the PCI-E GART registers to handle this.  Bit 6 in PCI-E indirect
> register 0x10 is named 'GART_RDREQPATH_SEL' and has description:
> 
> 	Read request path
> 	 0=HDP
> 	 1=Direct Rd Req to MC

Right. That's the one I was talking about earlier.

> This seems to be intended to be a way to have the GART PTE reads
> bypass the full Host Data Path (and thus potential surface byte
> swaps), by setting this bit to 1.
> 
> I tried doing that, but it doesn't help my RV370 so perhaps older
> chips don't support this bit.  It's hard to tell as this is the area
> where all of AMD's published GPU documents are severely lacking.

Yup.

> I tested whether reading back the PCIE_TX_GART_CNTL register shows
> bit 6 after I try to write it, and it always reads back zero.

Yes, it's likely that they added it after being bitten on Apple or
such :-)

 .../...

> 2) It doesn't check the surface byte swapping settings, so it
>    could be saving in one byte order and restoing in another.
> 
>    I guess we could force RADEON_SURFACE_CNTL to zero around
>    the two memcpy()'s done in radeon_driver.c

Either that or have the kernel do it... (ioctl's ?) which would be more
in the spirit of moving toward KMS and would avoid those blunders ...
That's my preferred approach.

But if we want to stick to the current mess, and we have a bolted
surface covering the GART, it would work fine as long as the restore
path ensures the DRM restores all surfaces before it loads it back into
vram, ie, same setting on save/restore, userspace doesn't have to care
what the swapper actually is.

> But really this whole area is a mess, and I know KMS is coming to fix
> this, but even in the traditional XORG/DRM layout XORG has no business
> keeping the GART table uptodate across VT switches.  It should be
> calling into the DRM with an ioctl to write the GART table out to VRAM
> again.

Preferably. Shouldn't even be hard in fact. Again, I haven't hit it on
my r500 tests because I let X POST the card and basically have nothing
on the VTs. I don't have a PCI-E R300 to test with, though there is
-one- model of iMac G5 with such a thing, and guess what ? It's been
troublesome for ages, though I never managed to get my hand on one to
actually debug it. It might just all be the same issues.

> Finally there is a huge issue with how the Xorg ATI driver resets the
> chip using the RBBM.  It soft resets the CP, but this resets the ring
> read pointer.  However, nothing is done to make sure the DRM resync's
> the ring write pointer (which remains unchanged by a soft CP reset) as
> well as it's internal software ring state.

Same as above, it should just not do it when the DRI's around. Just fire
an ioctl if necessary and let the DRM do it. But if it's going to do it,
bracketing it in CP stop/resume might do the trick.

> The result is that on the very next CP command submission, the CP
> re-executes everything from ring entry zero until wherever the DRM
> things the write pointer was at the time of the CP soft reset.
> 
> Any time the Xorg ATI driver does a CP soft reset, it should do
> CP stop and resume calls to the DRM to resync the ring pointers.
> And this does not appear to be happening where it needs to be
> happening.

Heh, looks like we're on the same mind set :-)

Ben.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using  pointer derefs.
  2009-02-14  7:42     ` Dave Airlie
  2009-02-14  8:58       ` David Miller
@ 2009-02-14  9:09       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2009-02-14  9:09 UTC (permalink / raw)
  To: Dave Airlie; +Cc: David Miller, airlied, dri-devel, linux-kernel


> The only think I can think to do is use a surface instead of the
> aperture swappers
> and make the surface cover all the VRAM except the GART table which is
> at the end.

Why not use a surface to cover the GART ? At least this would ensure a
known swapper setting for it. 

> That's interesting, it could explain why things never work again after a reset
> or at least proceed to hang straight away.

No shit :-)

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-14  9:07     ` Benjamin Herrenschmidt
@ 2009-02-14  9:11       ` David Miller
  2009-02-14  9:51         ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2009-02-14  9:11 UTC (permalink / raw)
  To: benh; +Cc: airlied, dri-devel, linux-kernel

From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Sat, 14 Feb 2009 20:07:54 +1100

> > I did some research, and it does appear that the GART does read the
> > PTEs from the VRAM using the Host Data Path.  This means the surface
> > control byte swapping settings are applied.
> > 
> > So for depths of 16 and 24, the GART is reading garbage PTEs.  And
> > that's why the CP hangs.
> 
> That makes me wonder how the heck did it work for me ! Or maybe... I've
> been using an R5xx which happens to have a bit that I haven't seen on
> R3xx that allows ... to set whether the GART reads come from HDP or
> directly from MC. That might be what saved my ass here.

I wonder.  But I really doubt it.  The bit is off by default
and the radeon DRM code explicitly sets it to off.

> We can do that by registering a surface from the kernel to cover the
> GART I suppose, and clean things a bit so that when using the DRI, X
> doesn't touch the surface registers -at all- and leaves it to the
> kernel.

That actually sounds like a good idea.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs.
  2009-02-14  9:11       ` David Miller
@ 2009-02-14  9:51         ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2009-02-14  9:51 UTC (permalink / raw)
  To: benh; +Cc: airlied, dri-devel, linux-kernel

From: David Miller <davem@davemloft.net>
Date: Sat, 14 Feb 2009 01:11:45 -0800 (PST)

> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Sat, 14 Feb 2009 20:07:54 +1100
> 
> > We can do that by registering a surface from the kernel to cover the
> > GART I suppose, and clean things a bit so that when using the DRI, X
> > doesn't touch the surface registers -at all- and leaves it to the
> > kernel.
> 
> That actually sounds like a good idea.

Ok I have it working, patch coming right up.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2009-02-14  9:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-02-12 10:15 [PATCH 1/5]: drm: ati_pcigart: Do not access I/O MEM space using pointer derefs David Miller
2009-02-12 10:35 ` Benjamin Herrenschmidt
2009-02-12 11:09   ` David Miller
2009-02-12 11:23     ` Dave Airlie
2009-02-12 22:17       ` David Miller
2009-02-12 21:26     ` Benjamin Herrenschmidt
2009-02-12 21:29       ` Benjamin Herrenschmidt
2009-02-14  6:09   ` David Miller
2009-02-14  7:42     ` Dave Airlie
2009-02-14  8:58       ` David Miller
2009-02-14  9:09       ` Benjamin Herrenschmidt
2009-02-14  9:07     ` Benjamin Herrenschmidt
2009-02-14  9:11       ` David Miller
2009-02-14  9:51         ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.