* some progress with radeon on C8000
From: Sven Schnelle @ 2019-09-28 21:44 UTC
  To: linux-parisc; +Cc: deller

Hi List,

I've spent quite some time this evening debugging why the Fire GL
doesn't work in my C8000. As reading debug output didn't give me
much insight, I decided to throw some hardware at the problem and
connect a logic analyzer to the C8000. For that I switched to an old
PCI Radeon 7000, which shows the same ring test failure.

I captured a few traces:

First, from the card in an x86 PC where it's working:

https://stackframe.org/radeon.png

We can clearly see the radeon fetching the ring descriptor via
DMA here. Note the DEADBEEF in the trace, which is the value
written to the scratch register during the ring test.

On the C8000, we can see that reading the DMA descriptor fails; the
radeon reads all zeros:

https://stackframe.org/c8000_radeon.png

I already had a flush_cache_all() in WREG32() for testing, but it looks
like this wasn't enough. Adding one to radeon_ring_write() makes the
radeon happy:

https://stackframe.org/c8000_fixed.png

My assumption was that the zx1 chipset takes care of cache coherency,
but it looks like that's not happening. Does that problem ring any
bells for anyone? Otherwise I'll continue investigating tomorrow.
(Almost midnight here)
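
As a rough illustration of the symptom described above, here is a toy userspace model of a write-back cache sitting between the CPU and a DMA-capable device. It is only a sketch: `struct toy_system`, `toy_init`, `cpu_write`, `cpu_flush` and `device_dma_read` are hypothetical names invented for this example, and nothing here is taken from the radeon driver or the zx1 chipset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_WORDS 4

/* Toy model: "memory" is what a DMA device sees, "cache" is what the CPU sees. */
struct toy_system {
    uint32_t memory[RING_WORDS]; /* backing RAM, read directly by the device */
    uint32_t cache[RING_WORDS];  /* CPU write-back cache contents */
    int dirty[RING_WORDS];       /* 1 if the cache holds data not yet in RAM */
};

/* Start with RAM and cache zeroed and no dirty lines. */
static void toy_init(struct toy_system *s)
{
    memset(s, 0, sizeof(*s));
}

/* A CPU store lands in the cache only and marks the line dirty. */
static void cpu_write(struct toy_system *s, unsigned idx, uint32_t val)
{
    assert(idx < RING_WORDS);
    s->cache[idx] = val;
    s->dirty[idx] = 1;
}

/* An explicit flush writes a dirty line back to RAM. */
static void cpu_flush(struct toy_system *s, unsigned idx)
{
    assert(idx < RING_WORDS);
    if (s->dirty[idx]) {
        s->memory[idx] = s->cache[idx];
        s->dirty[idx] = 0;
    }
}

/* The device bypasses the CPU cache and reads RAM directly. */
static uint32_t device_dma_read(const struct toy_system *s, unsigned idx)
{
    assert(idx < RING_WORDS);
    return s->memory[idx];
}
```

In this model, storing 0xDEADBEEF with `cpu_write()` and then calling `device_dma_read()` returns 0; after `cpu_flush()` the device sees 0xDEADBEEF. Real caches also write lines back on their own, so the model only mirrors the all-zero reads in the C8000 trace by analogy.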

dmesg looks better now, although I don't consider adding
flush_cache_all() a fix ;-)

[   21.186924] Linux agpgart interface v0.103
[   21.236890] quicksilver: IO PDIR shared with sba_iommu
[   21.343142]  (null): AGP aperture is 512M @ 0x60000000
[   21.397054] [drm] radeon kernel modesetting enabled.
[   21.457024] radeon 0000:60:04.0: remove_conflicting_pci_framebuffers: bar 0: 0xffffffffb0000000 -> 0xffffffffb7ffffff
[   21.586863] radeon 0000:60:04.0: remove_conflicting_pci_framebuffers: bar 2: 0xffffffffb80c0000 -> 0xffffffffb80cffff
[   21.719672] [drm] initializing kernel modesetting (RV100 0x1002:0x5159 0x1014:0x029A 0x00).
[   21.856909] __ioremap: ffffffffb80c0000 -> 0000000000050000
[   21.966904] __ioremap: ffffffffb0000000 -> 0000000010240000
[   22.066905] __ioremap: ffffffffb8080000 -> 00000000040a0000
[   22.136898] radeon 0000:60:04.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0x0000
[   22.276907] __ioremap: ffffffffb8080000 -> 0000000008120000
[   22.439336] radeon 0000:60:04.0: VRAM: 128M 0xFFFFFFFFB0000000 - 0xFFFFFFFFB7FFFFFF (8M used)
[   22.536865] radeon 0000:60:04.0: GTT: 512M 0xFFFFFFFF90000000 - 0xFFFFFFFFAFFFFFFF
[   22.626946] [drm] Detected VRAM RAM=128M, BAR=128M
[   22.686876] [drm] RAM width 32bits DDR
[   22.736874] [TTM] Zone  kernel: Available graphics memory: 2046222 KiB
[   22.806868] [TTM] Initializing pool allocator
[   22.866971] [drm] radeon: 8M of VRAM memory ready
[   22.916877] [drm] radeon: 512M of GTT memory ready.
[   22.976922] [drm] GART: num cpu pages 131072, num gpu pages 131072
[   23.161879] [drm] PCI GART of 512M enabled (table at 0x00000000400C0000).
[   23.236987] kmap: 000000413ee4f000
[   23.276887] radeon 0000:60:04.0: WB disabled
[   23.336865] radeon 0000:60:04.0: fence driver on ring 0 use gpu addr 0xffffffff90000000 and cpu addr 0x000000413ee4f000
[   23.466867] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[   23.536871] [drm] Driver supports precise vblank timestamp query.
[   23.616942] [drm] radeon: irq initialized.
[   23.666932] [drm] Loading R100 Microcode
[   23.756861] [drm] radeon: ring at 0xFFFFFFFF90001000
(debug dropped)
[   27.206908] [drm] ring test succeeded in 0 usecs
(debug dropped)
[   30.696921] [drm] ib test succeeded in 0 usecs
[   30.754178] [drm] No TV DAC info found in BIOS
[   30.806921] [drm] Radeon Display Connectors
[   30.856862] [drm] Connector 0:
[   30.886880] [drm]   VGA-1
[   30.916861] [drm]   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
[   30.996861] [drm]   Encoders:
[   31.026877] [drm]     CRT1: INTERNAL_DAC1
[   31.076861] [drm] Connector 1:
[   31.116866] [drm]   DVI-I-1
[   31.146877] [drm]   HPD2
[   31.176860] [drm]   DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
[   31.246877] [drm]   Encoders:
[   31.286865] [drm]     CRT2: INTERNAL_DAC2
[   31.336860] [drm]     DFP1: INTERNAL_TMDS1
[   31.476904] __ioremap: ffffffffb0040000 -> 0000000010700000
[   31.580409] [drm] fb mappable at 0xFFFFFFFFB0040000
[   31.636856] [drm] vram apper at 0xFFFFFFFFB0000000
[   31.696854] [drm] size 786432
[   31.726855] [drm] fb depth is 8
[   31.766854] [drm]    pitch is 1024
[   32.006860] Console: switching to colour frame buffer device 128x48
[   32.119230] radeon 0000:60:04.0: fb0: radeondrmfb frame buffer device
[   32.197017] [drm] Initialized radeon 2.50.0 20080528 for 0000:60:04.0 on minor 0

Regards
Sven

* Re: some progress with radeon on C8000
From: Thomas Bogendoerfer @ 2019-10-02 14:19 UTC
  To: Sven Schnelle; +Cc: linux-parisc, deller

On Sat, Sep 28, 2019 at 11:44:36PM +0200, Sven Schnelle wrote:
> Hi List,
> 
> I've spent quite some time this evening debugging why the Fire GL
> doesn't work in my C8000. As reading debug output didn't give me
> much insight, I decided to throw some hardware at the problem and
> connect a logic analyzer to the C8000. For that I switched to an old
> PCI Radeon 7000, which shows the same ring test failure.

The patch below (with debug prints left in) got the PCI radeon working for me
when I last played with it. The added fdc is a real fix, while the change
in parisc_agp_mask_memory is just a hack. The big problem there is getting
the virtual address at which the memory is mapped in user space...

Thomas.


diff --git a/drivers/char/agp/parisc-agp.c b/drivers/char/agp/parisc-agp.c
index 15f2e7025b78..756bc4a265d9 100644
--- a/drivers/char/agp/parisc-agp.c
+++ b/drivers/char/agp/parisc-agp.c
@@ -20,6 +20,7 @@
 #include <linux/agp_backend.h>
 #include <linux/log2.h>
 #include <linux/slab.h>
+#include <linux/pagemap.h>
 
 #include <asm/parisc-device.h>
 #include <asm/ropes.h>
@@ -162,6 +163,16 @@ parisc_agp_insert_memory(struct agp_memory *mem, off_t pg_start, int type)
 			info->gatt[j] =
 				parisc_agp_mask_memory(agp_bridge,
 					paddr, type);
+			asm volatile("fdc %%r0(%0)" : : "r" (&info->gatt[j]));
+#if 0
+#if 0
+			printk("i %x j %lx page %p va %lx  paddr %lx gatt %lx\n",
+			       i, j, mem->pages[i], __va(paddr), paddr, info->gatt[j]);
+#else
+			printk("i %x j %lx page %p va %lx  paddr %lx\n",
+			       i, j, mem->pages[i], __va(paddr), paddr);
+#endif
+#endif
 		}
 	}
 
@@ -184,7 +195,7 @@ parisc_agp_remove_memory(struct agp_memory *mem, off_t pg_start, int type)
 	io_pg_start = info->io_pages_per_kpage * pg_start;
 	io_pg_count = info->io_pages_per_kpage * mem->page_count;
 	for (i = io_pg_start; i < io_pg_count + io_pg_start; i++) {
-		info->gatt[i] = agp_bridge->scratch_page;
+		// info->gatt[i] = agp_bridge->scratch_page;
 	}
 
 	agp_bridge->driver->tlb_flush(mem);
@@ -195,7 +206,22 @@ static unsigned long
 parisc_agp_mask_memory(struct agp_bridge_data *bridge, dma_addr_t addr,
 		       int type)
 {
-	return SBA_PDIR_VALID_BIT | addr;
+#if 1
+	u64 pa;
+	register unsigned ci; /* coherent index */
+	
+	pa = addr & IOVP_MASK;
+	mtsp(0,1);
+	asm("lci 0(%%sr1, %1), %0" : "=r" (ci) : "r" (__va(pa)));
+	
+	pa |= (ci >> PAGE_SHIFT) & 0xff;  /* move CI (8 bits) into lowest byte */
+
+	pa |= SBA_PDIR_VALID_BIT;	/* set "valid" bit */
+
+	return cpu_to_le64(pa);
+#else
+	return cpu_to_le64(SBA_PDIR_VALID_BIT | addr);
+#endif
 }
 
 static void
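
The arithmetic in the parisc_agp_mask_memory hunk above can be tried in isolation with a small userspace sketch. The `TOY_*` constants and both function names are hypothetical; the constants are only assumed to mirror `PAGE_SHIFT`, `IOVP_MASK` and `SBA_PDIR_VALID_BIT`, whose real definitions live in the kernel headers. The coherence index is passed in as a plain argument because `lci` can only execute on PA-RISC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel's own definitions
 * may differ depending on configuration. */
#define TOY_PAGE_SHIFT     12
#define TOY_IOVP_MASK      (~((UINT64_C(1) << TOY_PAGE_SHIFT) - 1))
#define TOY_PDIR_VALID_BIT (UINT64_C(1) << 63)

/* Build a PDIR entry in CPU byte order, following the steps in the patch.
 * ci stands in for the raw value that the lci instruction would return. */
static uint64_t toy_pdir_entry(uint64_t addr, uint32_t ci)
{
    uint64_t pa = addr & TOY_IOVP_MASK;  /* page-aligned physical address */

    pa |= (ci >> TOY_PAGE_SHIFT) & 0xff; /* 8-bit coherence index, low byte */
    pa |= TOY_PDIR_VALID_BIT;            /* mark the entry valid */
    return pa;
}

/* Serialize a 64-bit value least-significant byte first, independent of
 * host endianness. On a big-endian CPU this is the byte layout that
 * cpu_to_le64() followed by a store produces. */
static void toy_store_le64(uint8_t out[8], uint64_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(v >> (8 * i));
}
```

With these assumed constants, `toy_pdir_entry(0x12345678, 0x000AB000)` yields 0x80000000123450AB, and `toy_store_le64()` lays it out in memory as AB 50 34 12 00 00 00 80.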

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

* Re: some progress with radeon on C8000
From: John David Anglin @ 2019-10-02 20:37 UTC
  To: Thomas Bogendoerfer, Sven Schnelle; +Cc: linux-parisc, deller

On 2019-10-02 10:19 a.m., Thomas Bogendoerfer wrote:
> On Sat, Sep 28, 2019 at 11:44:36PM +0200, Sven Schnelle wrote:
>> Hi List,
>>
>> I've spent quite some time this evening debugging why the Fire GL
>> doesn't work in my C8000. As reading debug output didn't give me
>> much insight, I decided to throw some hardware at the problem and
>> connect a logic analyzer to the C8000. For that I switched to an old
>> PCI Radeon 7000, which shows the same ring test failure.
> The patch below (with debug prints left in) got the PCI radeon working for me
> when I last played with it. The added fdc is a real fix, while the change
> in parisc_agp_mask_memory is just a hack. The big problem there is getting
> the virtual address at which the memory is mapped in user space...
>
> Thomas.
>
>
> diff --git a/drivers/char/agp/parisc-agp.c b/drivers/char/agp/parisc-agp.c
> index 15f2e7025b78..756bc4a265d9 100644
> --- a/drivers/char/agp/parisc-agp.c
> +++ b/drivers/char/agp/parisc-agp.c
> @@ -20,6 +20,7 @@
>  #include <linux/agp_backend.h>
>  #include <linux/log2.h>
>  #include <linux/slab.h>
> +#include <linux/pagemap.h>
>  
>  #include <asm/parisc-device.h>
>  #include <asm/ropes.h>
> @@ -162,6 +163,16 @@ parisc_agp_insert_memory(struct agp_memory *mem, off_t pg_start, int type)
>  			info->gatt[j] =
>  				parisc_agp_mask_memory(agp_bridge,
>  					paddr, type);
> +			asm volatile("fdc %%r0(%0)" : : "r" (&info->gatt[j]));
> +#if 0
> +#if 0
> +			printk("i %x j %lx page %p va %lx  paddr %lx gatt %lx\n",
> +			       i, j, mem->pages[i], __va(paddr), paddr, info->gatt[j]);
> +#else
> +			printk("i %x j %lx page %p va %lx  paddr %lx\n",
> +			       i, j, mem->pages[i], __va(paddr), paddr);
> +#endif
> +#endif
>  		}
>  	}
>  
> @@ -184,7 +195,7 @@ parisc_agp_remove_memory(struct agp_memory *mem, off_t pg_start, int type)
>  	io_pg_start = info->io_pages_per_kpage * pg_start;
>  	io_pg_count = info->io_pages_per_kpage * mem->page_count;
>  	for (i = io_pg_start; i < io_pg_count + io_pg_start; i++) {
> -		info->gatt[i] = agp_bridge->scratch_page;
> +		// info->gatt[i] = agp_bridge->scratch_page;
>  	}
>  
>  	agp_bridge->driver->tlb_flush(mem);
> @@ -195,7 +206,22 @@ static unsigned long
>  parisc_agp_mask_memory(struct agp_bridge_data *bridge, dma_addr_t addr,
>  		       int type)
>  {
> -	return SBA_PDIR_VALID_BIT | addr;
> +#if 1
> +	u64 pa;
> +	register unsigned ci; /* coherent index */
> +	
> +	pa = addr & IOVP_MASK;
> +	mtsp(0,1);
> +	asm("lci 0(%%sr1, %1), %0" : "=r" (ci) : "r" (__va(pa)));
I believe you can remove the mtsp and just use "lci 0(%1), %0" to load the coherence index.  The space
registers sr4 to sr7 are always 0 in kernel.

> +	
> +	pa |= (ci >> PAGE_SHIFT) & 0xff;  /* move CI (8 bits) into lowest byte */
> +
> +	pa |= SBA_PDIR_VALID_BIT;	/* set "valid" bit */
> +
> +	return cpu_to_le64(pa);
> +#else
> +	return cpu_to_le64(SBA_PDIR_VALID_BIT | addr);
> +#endif
>  }
>  
>  static void
>

Dave

-- 
John David Anglin  dave.anglin@bell.net


* Re: some progress with radeon on C8000
From: Thomas Bogendoerfer @ 2019-10-04 12:06 UTC
  To: John David Anglin; +Cc: Sven Schnelle, linux-parisc, deller

On Wed, Oct 02, 2019 at 04:37:41PM -0400, John David Anglin wrote:
> On 2019-10-02 10:19 a.m., Thomas Bogendoerfer wrote:
> > +	pa = addr & IOVP_MASK;
> > +	mtsp(0,1);
> > +	asm("lci 0(%%sr1, %1), %0" : "=r" (ci) : "r" (__va(pa)));
> I believe you can remove the mtsp and just use "lci 0(%1), %0" to load the coherence index.  The space
> registers sr4 to sr7 are always 0 in kernel.

OK, good to know.

While reading this, I realized what the other hacks were for, which I didn't
include in my first mail.

diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c
index 028ab6007873..e84c7652eb1b 100644
--- a/drivers/gpu/drm/ttm/ttm_agp_backend.c
+++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c
@@ -66,7 +67,8 @@ static int ttm_agp_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem)
 		if (!page)
 			page = ttm->dummy_read_page;
 
-		mem->pages[mem->page_count++] = page;
+		mem->pages[(ttm->num_pages - 1) - mem->page_count] = page;
+		mem->page_count++;
 	}
 	agp_be->mem = mem;
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index d0459b392e5e..4bb301cab128 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -571,8 +571,14 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
 		 */
 		prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
 		map->bo_kmap_type = ttm_bo_map_vmap;
+		printk("vmap %p\n", ttm->pages[start_page]);
+#if 0
 		map->virtual = vmap(ttm->pages + start_page, num_pages,
 				    0, prot);
+#else
+		map->virtual = kmap(ttm->pages[start_page]);
+#endif
+		
 	}
 	return (!map->virtual) ? -ENOMEM : 0;
 }

 
This is needed to be able to get the virtual address with __va(pa).

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

* Re: some progress with radeon on C8000
From: Helge Deller @ 2019-10-04 12:36 UTC
  To: Thomas Bogendoerfer, John David Anglin; +Cc: Sven Schnelle, linux-parisc

On 04.10.19 14:06, Thomas Bogendoerfer wrote:
> On Wed, Oct 02, 2019 at 04:37:41PM -0400, John David Anglin wrote:
>> On 2019-10-02 10:19 a.m., Thomas Bogendoerfer wrote:
>>> +	pa = addr & IOVP_MASK;
>>> +	mtsp(0,1);
>>> +	asm("lci 0(%%sr1, %1), %0" : "=r" (ci) : "r" (__va(pa)));
>> I believe you can remove the mtsp and just use "lci 0(%1), %0" to load the coherence index.  The space
>> registers sr4 to sr7 are always 0 in kernel.
>
> OK, good to know.
>
> While reading this, I realized what the other hacks were for, which I didn't
> include in my first mail.
>
> diff --git a/drivers/gpu/drm/ttm/ttm_agp_backend.c b/drivers/gpu/drm/ttm/ttm_agp_backend.c
> index 028ab6007873..e84c7652eb1b 100644
> --- a/drivers/gpu/drm/ttm/ttm_agp_backend.c
> +++ b/drivers/gpu/drm/ttm/ttm_agp_backend.c
> @@ -66,7 +67,8 @@ static int ttm_agp_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem)
>   		if (!page)
>   			page = ttm->dummy_read_page;
>
> -		mem->pages[mem->page_count++] = page;
> +		mem->pages[(ttm->num_pages - 1) - mem->page_count] = page;
> +		mem->page_count++;
>   	}
>   	agp_be->mem = mem;
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> index d0459b392e5e..4bb301cab128 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> @@ -571,8 +571,14 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo,
>   		 */
>   		prot = ttm_io_prot(mem->placement, PAGE_KERNEL);
>   		map->bo_kmap_type = ttm_bo_map_vmap;
> +		printk("vmap %p\n", ttm->pages[start_page]);
> +#if 0
>   		map->virtual = vmap(ttm->pages + start_page, num_pages,
>   				    0, prot);
> +#else
> +		map->virtual = kmap(ttm->pages[start_page]);
> +#endif
> +
>   	}
>   	return (!map->virtual) ? -ENOMEM : 0;
>   }
>
>
> This is needed to be able to get the virtual address with __va(pa).

Can you make a documented patch out of all that?
I'd like to include it in at least a test/hack branch, e.g.
https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/commit/?h=radeon-test&id=0ef942c21d37078ae6406b3e7075f3dbe6417a04

Helge

* Re: some progress with radeon on C8000
From: Sven Schnelle @ 2019-10-07  7:33 UTC
  To: Thomas Bogendoerfer; +Cc: linux-parisc, deller

Hi Thomas,

On Wed, Oct 02, 2019 at 04:19:07PM +0200, Thomas Bogendoerfer wrote:
> On Sat, Sep 28, 2019 at 11:44:36PM +0200, Sven Schnelle wrote:
> > Hi List,
> > 
> > I've spent quite some time this evening debugging why the Fire GL
> > doesn't work in my C8000. As reading debug output didn't give me
> > much insight, I decided to throw some hardware at the problem and
> > connect a logic analyzer to the C8000. For that I switched to an old
> > PCI Radeon 7000, which shows the same ring test failure.
> 
> The patch below (with debug prints left in) got the PCI radeon working for me
> when I last played with it. The added fdc is a real fix, while the change
> in parisc_agp_mask_memory is just a hack. The big problem there is getting
> the virtual address at which the memory is mapped in user space...

Thanks. I wasn't aware that you had already spent the time to debug this.

Regards
Sven
