linux-kernel.vger.kernel.org archive mirror
* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result
       [not found] ` <fa.hinb9iv.s38127@ifi.uio.no>
@ 2005-01-22  9:40   ` Andreas Hartmann
  2005-01-22 13:01     ` Dave Airlie
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Hartmann @ 2005-01-22  9:40 UTC (permalink / raw)
  To: Helge Hafting, linux-kernel

Helge Hafting wrote:
> On Fri, Jan 21, 2005 at 09:05:12PM +0100, Andreas Hartmann wrote:
>> Hello Helge,
>> 
>> Helge Hafting wrote:
>> > On Sun, Jan 16, 2005 at 10:41:23PM +1100, Dave Airlie wrote:
>> >> > 
>> >> > I'm fine with adding this code, but we still don't know if this is the
>> >> > cause of his problem. The debug output can determine if this really is
>> >> > the source of the problem or if it is somewhere else.
>> >> > 
>> >> 
>> >> I actually doubt it is this stuff.. my guess is that it is something
>> >> nasty like ACPI breaking int10 for X or something like that... it
>> >> seems a lot more subtle than the usual things that break when we
>> >> mess with the DRM :-)
>> 
>> Which glibc do you use? I have problems with glibc 2.3.4, kernel 2.4.x and
>> X / Xorg while executing the int10 code of X. glibc 2.3.3 works fine for
>> me. But I found another posting which reports problems even with
>> glibc 2.3.3 and kernel 2.4.x.
>> 
>> It's new to me that there could be problems with 2.6 kernel versions, too.
>> 
>> Therefore, it would be really interesting to know which glibc version
>> you are using.
>> 
> I use glibc 2.3.2 from debian testing (or unstable).  
> This is not the problem though, because a reboot into 2.6.8.1 makes
> X work without crashing.  The crash only happens with 2.6.9-rc2
> or later kernels.

Did you try another version of glibc?

> So the only way glibc could be the culprit, is if the newer kernel
> exports some new interface that this glibc manages to mess up.  Still,
> even a buggy glibc shouldn't hang the kernel anyway.

That's certainly correct.

> Such issues
> could crash (all) user apps, but shouldn't prevent the machine from
> responding to sysrq sequences.

You emphasized how the effects differ. But in every case I know of, the
cause is the same: int10 crashes X or even the whole kernel.

I was able to trace the problem to the following point:

--------------------------------------------------------------------------
static int
vm86_rep(struct vm86_struct *ptr)
{
    int __res;

#ifdef __PIC__
    /* When compiling with -fPIC, we can't use asm constraint "b" because
       %ebx is already taken by gcc. */
    __asm__ __volatile__("pushl %%ebx\n\t"
                         "movl %2,%%ebx\n\t"
                         "movl %1,%%eax\n\t"
                         "int $0x80\n\t"
                         "popl %%ebx"
                         :"=a" (__res)
                         :"n" ((int)113), "r" ((struct vm86_struct *)ptr));
#else
    __asm__ __volatile__("int $0x80\n\t"
                         :"=a" (__res):"a" ((int)113),
                         "b" ((struct vm86_struct *)ptr));
#endif
    /* Comment from me */
    xf86MsgVerb(X_INFO,3,"my comment\n");
    if (__res < 0) {
        errno = -__res;
        __res = -1;
    } else {
        errno = 0;
    }
    return __res;
}

#endif
-----------------------------------------------------------------------

I could see that X crashes in glibc 2.3.4 with kernel 2.4.x (not with
kernel 2.6.x for x <= 10; x > 10 not tested) during the first malloc call
after int10, made while executing the function
xf86MsgVerb(X_INFO,3,"my comment\n");
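
For readers unfamiliar with the tail of vm86_rep(): it follows the usual
raw-syscall convention, where the kernel hands back a negative error number
and the wrapper turns that into -1 plus errno. (Number 113 should be vm86old
on i386, as far as I know.) A minimal Python sketch of just that conversion,
mirroring the check in the snippet; decode_raw_return is a hypothetical
helper for illustration, not part of X or glibc:

```python
import errno
import os
from typing import Tuple


def decode_raw_return(res: int) -> Tuple[int, int]:
    """Mirror the end of vm86_rep(): a negative raw result means -errno.

    Returns (return_value, errno_value) the way the C wrapper would
    leave them for the caller.
    """
    if res < 0:
        return -1, -res
    return res, 0


# Example: the kernel reports EFAULT as a negative raw value.
ret, err = decode_raw_return(-errno.EFAULT)
print(ret, os.strerror(err))
```

Note that this only models the bookkeeping after the int $0x80; it says
nothing about why the following malloc call goes wrong.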


The crashes depend on which software versions are combined:

glibc 2.3.3 or 2.3.4 with kernel 2.4.x
glibc 2.3.2 with kernel > 2.6.9rc2

I asked an X developer, but so far he hasn't been able to help either.


I can't say whether glibc or the kernel is the problem. It can't be
attributed reliably to glibc, to the kernel, or to X. Therefore, it
_seems_ to me that nobody really cares about the problem.

I'm willing to help find the problem, but I'm neither a kernel
developer nor a glibc developer nor an X developer. I depend on
the support of the developers.

I think one developer from each project should work together to
find the problem. I could ask an X developer I know whether he is
willing to help find the problem together with a developer from the
kernel and one from glibc (I don't know whom to ask on the glibc team).


Kind regards,
Andreas Hartmann


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result
  2005-01-22  9:40   ` 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result Andreas Hartmann
@ 2005-01-22 13:01     ` Dave Airlie
  2005-01-26 10:05       ` 2.6.10 dies when X uses PCI radeon 9200 SE, further " Helge Hafting
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Airlie @ 2005-01-22 13:01 UTC (permalink / raw)
  To: Andreas Hartmann; +Cc: Helge Hafting, linux-kernel

> >>
> 
> That's certainly correct.
> 
> > Such issues
> > could crash (all) user apps, but shouldn't prevent the machine from
> > responding to sysrq sequences.
> 
> You emphasized how the effects differ. But in every case I know of, the
> cause is the same: int10 crashes X or even the whole kernel.
> 
> I was able to trace the problem to the following point:
> 
> 
> I could see that X crashes in glibc 2.3.4 with kernel 2.4.x (not with
> kernel 2.6.x for x <= 10; x > 10 not tested) during the first malloc call
> after int10, made while executing the function
> xf86MsgVerb(X_INFO,3,"my comment\n");
> 
> The crashes depend on which software versions are combined:
> 
> glibc 2.3.3 or 2.3.4 with kernel 2.4.x
> glibc 2.3.2 with kernel > 2.6.9rc2
> 


Well, if you can track down which patch in -rc2 causes it, then we can
annoy the person who created it. Building some kernels from the bk
snapshots might help, as -rc2 is quite large vs. -rc1.

Dave.


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-22 13:01     ` Dave Airlie
@ 2005-01-26 10:05       ` Helge Hafting
  2005-01-30 11:16         ` Helge Hafting
  0 siblings, 1 reply; 16+ messages in thread
From: Helge Hafting @ 2005-01-26 10:05 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Andreas Hartmann, Helge Hafting, linux-kernel

Dave Airlie wrote:

>
>Well, if you can track down which patch in -rc2 causes it, then we can
>annoy the person who created it. Building some kernels from the bk
>snapshots might help, as -rc2 is quite large vs. -rc1.
>
>  
>
So far, 2.6.9-rc1-bk10 works (X starts without hang, log indicates drm 
works too).

2.6.9-rc1-bk15 started X without a hang, but failed drm according to the 
log:
(II) RADEON(0): [drm] created "radeon" driver at busid "PCI:0:8:0"
(II) RADEON(0): [drm] added 8192 byte SAREA at 0xe0932000
(II) RADEON(0): [drm] mapped SAREA 0xe0932000 to 0xafd8d000
(II) RADEON(0): [drm] framebuffer handle = 0xe0000000
(II) RADEON(0): [drm] added 1 reserved context for kernel
(EE) RADEON(0): [pci] Out of memory (-1007)
(II) RADEON(0): [drm] removed 1 reserved context for kernel
(II) RADEON(0): [drm] unmapping 8192 bytes of SAREA 0xe0932000 at 0xafd8d000
(II) RADEON(0): Memory manager initialized to (0,0) (1280,8191)

A normal log (bk10) looks like this:
(II) RADEON(0): [drm] created "radeon" driver at busid "PCI:0:8:0"
(II) RADEON(0): [drm] added 8192 byte SAREA at 0xe0932000
(II) RADEON(0): [drm] mapped SAREA 0xe0932000 to 0xafd8d000
(II) RADEON(0): [drm] framebuffer handle = 0xe0000000
(II) RADEON(0): [drm] added 1 reserved context for kernel
(II) RADEON(0): [pci] 8192 kB allocated with handle 0xe0935000
(II) RADEON(0): [pci] ring handle = 0xe0935000
(II) RADEON(0): [pci] Ring mapped at 0xafc8c000
(II) RADEON(0): [pci] Ring contents 0x00000000
(II) RADEON(0): [pci] ring read ptr handle = 0xe0a36000
(II) RADEON(0): [pci] Ring read ptr mapped at 0xafc8b000
(II) RADEON(0): [pci] Ring read ptr contents 0x00000000
(II) RADEON(0): [pci] vertex/indirect buffers handle = 0xe0a37000
(II) RADEON(0): [pci] Vertex/indirect buffers mapped at 0xafa8b000
(II) RADEON(0): [pci] Vertex/indirect buffers contents 0x00000000
(II) RADEON(0): [pci] GART texture map handle = 0xe0c37000
(II) RADEON(0): [pci] GART Texture map mapped at 0xaf5ab000
(II) RADEON(0): [drm] register handle = 0xf6000000
(II) RADEON(0): [dri] Visual configs initialized
(II) RADEON(0): CP in BM mode
(II) RADEON(0): Using 8 MB GART aperture
(II) RADEON(0): Using 1 MB for the ring buffer
(II) RADEON(0): Using 2 MB for vertex/indirect buffers
(II) RADEON(0): Using 5 MB for GART textures
(II) RADEON(0): Memory manager initialized to (0,0) (1280,8191)
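
Comparing the two logs by eye works, but the difference can also be pulled
out mechanically. A small sketch, assuming the X log lines keep the leading
(II)/(EE)/(WW) markers shown above; error_lines is a hypothetical helper,
and the sample lines are copied from the bk15 log:

```python
import re

# Leading markers as they appear in the X logs in this thread,
# e.g. (II) informational, (EE) error, (WW) warning.
MARKER = re.compile(r"^\((II|EE|WW|\*\*|--|==)\)\s+(.*)$")


def error_lines(log_text):
    """Return the message part of every (EE) line in an X log."""
    errors = []
    for line in log_text.splitlines():
        m = MARKER.match(line.strip())
        if m and m.group(1) == "EE":
            errors.append(m.group(2))
    return errors


bk15_log = """\
(II) RADEON(0): [drm] added 1 reserved context for kernel
(EE) RADEON(0): [pci] Out of memory (-1007)
(II) RADEON(0): [drm] removed 1 reserved context for kernel
"""

print(error_lines(bk15_log))
```

Run against the bk10 log, the same function should return an empty list,
which makes it easy to classify a snapshot without reading the whole file.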


What is the most useful thing to do now?
Binary searching for the crash between bk15 and rc2?   Or:
Binary searching for the "out of memory" between bk10 and bk15?

Helge Hafting




* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-26 10:05       ` 2.6.10 dies when X uses PCI radeon 9200 SE, further " Helge Hafting
@ 2005-01-30 11:16         ` Helge Hafting
  2005-01-30 11:22           ` Dave Airlie
  0 siblings, 1 reply; 16+ messages in thread
From: Helge Hafting @ 2005-01-30 11:16 UTC (permalink / raw)
  To: airlied; +Cc: Andreas Hartmann, linux-kernel


>> What is the most useful thing to do now?
>> Binary searching for the crash between bk15 and rc2?   Or:
>
> I'd keep looking for the crash... the out of memory will probably
> disappear with a later snapshot..


After sorting out a stupid build problem, here is the result for
the binary search for the crash.
2.6.9         crash
2.6.9-rc2     pci-oom
2.6.9-rc3     crash
2.6.9-rc2-bk7 crash
2.6.9-rc2-bk4 crash
2.6.9-rc2-bk2 pci-oom
2.6.9-rc2-bk3 crash in ifconfig

Up to 2.6.9-rc2-bk2 we don't get a crash, instead the X log shows this:

(EE) RADEON(0): [pci] Out of memory (-1007)

and gives up on drm in an orderly fashion.  
2.6.9-rc2-bk4 crashes though.  As usual, the X log ends with:
(II) LoadModule: "int10"
(II) Reloading /usr/X11R6/lib/modules/linux/libint10.a
(II) RADEON(0): initializing int10
(**) RADEON(0): Option "InitPrimary" "on"
(II) Truncating PCI BIOS Length to 53248


2.6.9-rc2-bk3 wasn't tested further, because the kernel dies upon
running "ifconfig" which must be some other temporary bug.
X will probably be difficult without IPv4 anyway.
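
The search above can be sketched as a bisection that tolerates untestable
snapshots such as bk3. This is only an illustration: bisect_snapshots and
the verdict strings are hypothetical, the list contains just the snapshots
named in this message (not every bk snapshot that exists), and "good" here
means "no crash", so pci-oom counts as good.

```python
def bisect_snapshots(snapshots, test):
    """Narrow down (last_good, first_bad) in an ordered snapshot list.

    Assumes snapshots[0] is known good and snapshots[-1] is known bad.
    test(name) returns "good", "bad" or "skip" (snapshot untestable).
    """
    lo, hi = 0, len(snapshots) - 1
    skipped = set()
    while hi - lo > 1:
        # Untested snapshots strictly between the current bounds.
        candidates = [i for i in range(lo + 1, hi) if i not in skipped]
        if not candidates:
            break  # only untestable snapshots remain in the gap
        mid = candidates[len(candidates) // 2]
        verdict = test(snapshots[mid])
        if verdict == "good":
            lo = mid
        elif verdict == "bad":
            hi = mid
        else:
            skipped.add(mid)
    return snapshots[lo], snapshots[hi]


# Results reported above; pci-oom is treated as "good" (no crash).
observed = {
    "2.6.9-rc2": "good",
    "2.6.9-rc2-bk2": "good",
    "2.6.9-rc2-bk3": "skip",   # kernel dies in ifconfig, X not tested
    "2.6.9-rc2-bk4": "bad",
    "2.6.9-rc2-bk7": "bad",
    "2.6.9-rc3": "bad",
}
ordered = list(observed)  # already in chronological order

print(bisect_snapshots(ordered, observed.get))
```

With these results the gap cannot be narrowed beyond bk2..bk4, because the
only snapshot in between is the one that could not be tested.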

Helge Hafting 



* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 11:16         ` Helge Hafting
@ 2005-01-30 11:22           ` Dave Airlie
  2005-01-30 14:43             ` Jon Smirl
                               ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Dave Airlie @ 2005-01-30 11:22 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Andreas Hartmann, linux-kernel, Jon Smirl

> 
> 
> After sorting out a stupid build problem, here is the result for
> the binary search for the crash.
> 2.6.9         crash
> 2.6.9-rc2     pci-oom
> 2.6.9-rc3     crash
> 2.6.9-rc2-bk7 crash
> 2.6.9-rc2-bk4 crash
> 2.6.9-rc2-bk2 pci-oom
> 2.6.9-rc2-bk3 crash in ifconfig
> 
> Up to 2.6.9-rc2-bk2 we don't get a crash, instead the X log shows this:
> 
> (EE) RADEON(0): [pci] Out of memory (-1007)
> 
> and gives up on drm in an orderly fashion.
> 2.6.9-rc2-bk4 crashes though.  As usual, the X log ends with:
> (II) LoadModule: "int10"
> (II) Reloading /usr/X11R6/lib/modules/linux/libint10.a
> (II) RADEON(0): initializing int10
> (**) RADEON(0): Option "InitPrimary" "on"
> (II) Truncating PCI BIOS Length to 53248
> 
> 2.6.9-rc2-bk3 wasn't tested further, because the kernel dies upon
> running "ifconfig" which must be some other temporary bug.
> X will probably be difficult without IPv4 anyway.
> 

Just another guess, but Jon, could the PCI ROM patch mess up X's access
via the int10 handler, maybe if it isn't mapped properly?

Dave.


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 11:22           ` Dave Airlie
@ 2005-01-30 14:43             ` Jon Smirl
  2005-01-30 15:05             ` Jon Smirl
  2005-01-30 15:07             ` 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result Jon Smirl
  2 siblings, 0 replies; 16+ messages in thread
From: Jon Smirl @ 2005-01-30 14:43 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Helge Hafting, Andreas Hartmann, linux-kernel

On Sun, 30 Jan 2005 22:22:24 +1100, Dave Airlie <airlied@gmail.com> wrote:
> Just another guess, but Jon, could the PCI ROM patch mess up X's access
> via the int10 handler, maybe if it isn't mapped properly?

The ROM patch is inactive until you echo something to the sysfs ROM variable.

> 
> Dave.
> 


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 11:22           ` Dave Airlie
  2005-01-30 14:43             ` Jon Smirl
@ 2005-01-30 15:05             ` Jon Smirl
  2005-01-30 16:32               ` Helge Hafting
  2005-01-30 15:07             ` 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result Jon Smirl
  2 siblings, 1 reply; 16+ messages in thread
From: Jon Smirl @ 2005-01-30 15:05 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Helge Hafting, Andreas Hartmann, linux-kernel

I just checked out on current Linus BK with my AGP Radeon 9000 which
is pretty close to a 9200. Everything is working fine.

I notice from his logs that he is running a PCI radeon, not an AGP
one. Didn't someone make some changes to the PCI radeon memory
management code recently? I run a PCI R128 and that is still working.
DRM debug output might give more clues.

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 11:22           ` Dave Airlie
  2005-01-30 14:43             ` Jon Smirl
  2005-01-30 15:05             ` Jon Smirl
@ 2005-01-30 15:07             ` Jon Smirl
  2 siblings, 0 replies; 16+ messages in thread
From: Jon Smirl @ 2005-01-30 15:07 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Helge Hafting, Andreas Hartmann, linux-kernel

Doesn't PCI:0:8:0 have to be on a PCI bus? AGP would look like
PCI:1:0:0 or PCI:2:0:0.
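
To make that rule of thumb concrete, here is a small sketch that splits a
BusID from the X log into bus, device and function and converts it to the
name the same device would have under /sys/bus/pci/devices. The helper names
are hypothetical. It assumes the X BusID fields are decimal while sysfs names
are hexadecimal, and "bus 0 means plain PCI" is only the heuristic described
above, not a guarantee.

```python
def parse_busid(busid):
    """Split an X BusID such as "PCI:0:8:0" into (bus, device, function)."""
    prefix, bus, dev, func = busid.split(":")
    if prefix != "PCI":
        raise ValueError("not a PCI BusID: %r" % (busid,))
    return int(bus), int(dev), int(func)


def busid_to_sysfs(busid, domain=0):
    """Format a BusID as a sysfs PCI device name (domain:bus:dev.fn)."""
    bus, dev, func = parse_busid(busid)
    return "%04x:%02x:%02x.%x" % (domain, bus, dev, func)


def on_root_bus(busid):
    """Heuristic from this thread: bus 0 suggests a plain PCI slot."""
    return parse_busid(busid)[0] == 0


print(busid_to_sysfs("PCI:0:8:0"), on_root_bus("PCI:0:8:0"))
print(busid_to_sysfs("PCI:1:0:0"), on_root_bus("PCI:1:0:0"))
```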

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 15:05             ` Jon Smirl
@ 2005-01-30 16:32               ` Helge Hafting
  2005-01-30 17:05                 ` Jon Smirl
  0 siblings, 1 reply; 16+ messages in thread
From: Helge Hafting @ 2005-01-30 16:32 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Dave Airlie, Helge Hafting, Andreas Hartmann, linux-kernel

On Sun, Jan 30, 2005 at 10:05:16AM -0500, Jon Smirl wrote:
> I just checked out on current Linus BK with my AGP Radeon 9000 which
> is pretty close to a 9200. Everything is working fine.
> 
> I notice from his logs that he is running a PCI radeon, not an AGP
> one. Didn't someone make some changes to the PCI radeon memory
> management code recently? I run a PCI R128 and that is still working.
> DRM debug output might give more clues.
> 
Yes, it is a PCI radeon.  And the machine has an AGP slot
too, which is used by a matrox G550.  This AGP card was not
used in the test, (other than being the VGA console).
Note that there is no crash if I don't compile 
AGP support, so the crash is related to AGP somehow even though
AGP is not supposed to be used in this case.

As I start X (on the radeon) I notice that the VGA console 
I'm using (on the G550 AGP) goes black.  I see no need for that either,
the radeon display is a _different_ device so why black out 
the vgacon?  Could the problem lurk there somehow?

Helge Hafting



* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 16:32               ` Helge Hafting
@ 2005-01-30 17:05                 ` Jon Smirl
  2005-01-30 21:56                   ` Helge Hafting
  2005-02-04  3:43                   ` 2.6.10 dies when X uses G550 Helge Hafting
  0 siblings, 2 replies; 16+ messages in thread
From: Jon Smirl @ 2005-01-30 17:05 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Dave Airlie, Andreas Hartmann, linux-kernel

On Sun, 30 Jan 2005 17:32:41 +0100, Helge Hafting
<helgehaf@aitel.hist.no> wrote:
> Yes, it is a PCI radeon.  And the machine has an AGP slot
> too, which is used by a matrox G550.  This AGP card was not
> used in the test, (other than being the VGA console).
> Note that there is no crash if I don't compile
> AGP support, so the crash is related to AGP somehow even though
> AGP is not supposed to be used in this case.

Can you set the PCI card to be primary in your BIOS or remove the AGP
card, and then see if it works? It could be that X's video reset code
for secondary PCI cards is broken.

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
  2005-01-30 17:05                 ` Jon Smirl
@ 2005-01-30 21:56                   ` Helge Hafting
  2005-02-04  3:43                   ` 2.6.10 dies when X uses G550 Helge Hafting
  1 sibling, 0 replies; 16+ messages in thread
From: Helge Hafting @ 2005-01-30 21:56 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Dave Airlie, Andreas Hartmann, linux-kernel

On Sun, Jan 30, 2005 at 12:05:27PM -0500, Jon Smirl wrote:
> On Sun, 30 Jan 2005 17:32:41 +0100, Helge Hafting
> <helgehaf@aitel.hist.no> wrote:
> > Yes, it is a PCI radeon.  And the machine has an AGP slot
> > too, which is used by a matrox G550.  This AGP card was not
> > used in the test, (other than being the VGA console).
> > Note that there is no crash if I don't compile
> > AGP support, so the crash is related to AGP somehow even though
> > AGP is not supposed to be used in this case.
> 
> Can you set the PCI card to be primary in your BIOS or remove the AGP
> card, and then see if it works? It could be that X's video reset code
> for secondary PCI cards is broken.
> 
I set the PCI card to primary, and kept the AGP card. Then I booted up
2.6.9-rc3 which normally crashes hard when X starts.  

But now X came up just fine on the radeon!  The log indicates
no problems with drm either, I did not get a "pci oom".
I didn't actually test with glxgears, but drm came up according
to the logs.
I did not change the X setup, so the pci card was initialized by
int10 although that wasn't necessary this time.


I have not yet tested whether the AGP card works in this configuration; my
user was impatient, so I had to restore a known working configuration.

Helge Hafting



* Re: 2.6.10 dies when X uses G550
  2005-01-30 17:05                 ` Jon Smirl
  2005-01-30 21:56                   ` Helge Hafting
@ 2005-02-04  3:43                   ` Helge Hafting
  2005-02-04  4:47                     ` Jon Smirl
  2005-02-04  5:57                     ` Dave Airlie
  1 sibling, 2 replies; 16+ messages in thread
From: Helge Hafting @ 2005-02-04  3:43 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Dave Airlie, Andreas Hartmann, linux-kernel

On Sun, Jan 30, 2005 at 12:05:27PM -0500, Jon Smirl wrote:
> On Sun, 30 Jan 2005 17:32:41 +0100, Helge Hafting
> <helgehaf@aitel.hist.no> wrote:
> > Yes, it is a PCI radeon.  And the machine has an AGP slot
> > too, which is used by a matrox G550.  This AGP card was not
> > used in the test, (other than being the VGA console).
> > Note that there is no crash if I don't compile
> > AGP support, so the crash is related to AGP somehow even though
> > AGP is not supposed to be used in this case.
> 
> Can you set the PCI card to be primary in your BIOS or remove the AGP
> card, and then see if it works? It could be that X's video reset code
> for secondary PCI cards is broken.
> 
I tried this. I already reported that X came up on the radeon.
I could not run X on the G550, now that it was "secondary",
but the crash was different from the radeon crash.

The logs with secondary radeon used to end like this:
(II) LoadModule: "int10"
(II) Reloading /usr/X11R6/lib/modules/linux/libint10.a
(II) RADEON(0): initializing int10
(**) RADEON(0): Option "InitPrimary" "on"
(II) Truncating PCI BIOS Length to 53248


The log for the secondary G550 ends like this, with or without int10:
(--) MGA(0): Pseudo-DMA transfer window at 0xF3000000
(==) MGA(0): BIOS at 0xC0000
(WW) MGA(0): Video BIOS info block not detected!
(II) MGA(0): MGABios.RamdacType = 0x0
(==) MGA(0): Write-combining range (0xf0000000,0x2000000)
(--) MGA(0): VideoRAM: 2048 kByte
(II) Loading sub module "ddc"
(II) LoadModule: "ddc"
(II) Reloading /usr/X11R6/lib/modules/libddc.a
(II) Loading sub module "i2c"
(II) LoadModule: "i2c"
(II) Loading /usr/X11R6/lib/modules/libi2c.a
(II) Module i2c: vendor="The XFree86 Project"
        compiled for 4.3.0.1, module version = 1.2.0
        ABI class: XFree86 Video Driver, version 0.6
(==) MGA(0): Write-combining range (0xf0000000,0x200000)
(II) MGA(0): vgaHWGetIOBase: hwp->IOBase is 0x03d0, hwp->PIOOffset is 0x0000
(II) MGA(0): I2C bus "DDC" initialized.
(II) MGA(0): I2C device "DDC:ddc2" registered at address 0xA0.
(II) MGA(0): I2C device "DDC:ddc2" removed.
(II) MGA(0): I2C Monitor info: (nil)
(II) MGA(0): end of I2C Monitor info

The video bios is apparently not detected at all, and therefore not run.

The kernel doesn't actually hang; only X gets stuck.  sysrq+T
dumped stack traces for all tasks except the X server.  For X,
there was only one line identifying the X server process and the fact
that it was Running.  No stack dump.  I managed to kill all tasks
and have init proceed into runlevel 2.

So I wonder: is Debian's X 4.3.0.1 buggy, or the kernel?
The fact remains that the newer kernels lock up while trying to use the
secondary radeon, while it actually works with 2.6.8.1.

Well, actually 2.6.8.1 is a bit unstable once 3D stuff starts on the
radeon - but it is only the radeon xserver that locks up in an
infinite loop after a while. Other processes remain responsive.

Helge Hafting


* Re: 2.6.10 dies when X uses G550
  2005-02-04  3:43                   ` 2.6.10 dies when X uses G550 Helge Hafting
@ 2005-02-04  4:47                     ` Jon Smirl
  2005-02-04  5:57                     ` Dave Airlie
  1 sibling, 0 replies; 16+ messages in thread
From: Jon Smirl @ 2005-02-04  4:47 UTC (permalink / raw)
  To: Helge Hafting
  Cc: Dave Airlie, Andreas Hartmann, linux-kernel, Xserver development

This appears to me to be a problem with the drivers in the X server.
DRM isn't active yet so I don't think the problem is there. There may
have been a kernel change that caused BIOS reset to stop working.

X does nasty things to the PCI bus from user space, and there are many
ways that X and the kernel could interfere with each other. Maybe
someone who owns a PCI video card and a Matrox G550 can step through this
in a debugger and see what is happening.

You can look at the contents of the video BIOS ROMs in recent
kernels. Go into sysfs and find your video card, then echo 1 >rom. That will
enable the ROM access code. You can then hexdump the ROM contents.
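
As a rough illustration of those steps, the sketch below enables the rom
attribute, reads the first bytes, and disables it again. Treat it as a sketch
under assumptions: the function names are made up for this example, it needs
root, and the device path in the note after the code is only a guess at where
the PCI radeon from the logs (PCI:0:8:0) would show up. The 0x55 0xAA check
is the standard signature at the start of a PCI expansion ROM image.

```python
PCI_ROM_SIGNATURE = b"\x55\xaa"  # first two bytes of a PCI expansion ROM


def has_rom_signature(data):
    """True if the data starts with the expansion ROM signature."""
    return data[:2] == PCI_ROM_SIGNATURE


def hexdump(data, width=16):
    """Format bytes as 'offset  hex hex ...' lines."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        lines.append("%08x  %s" % (off, " ".join("%02x" % b for b in chunk)))
    return lines


def read_rom(rom_path, limit=256):
    """Enable a sysfs rom file, read up to `limit` bytes, disable it again."""
    with open(rom_path, "w") as f:
        f.write("1")          # same effect as: echo 1 >rom
    try:
        with open(rom_path, "rb") as f:
            return f.read(limit)
    finally:
        with open(rom_path, "w") as f:
            f.write("0")      # switch ROM access back off
```

A possible use, as root, would be
data = read_rom("/sys/bus/pci/devices/0000:00:08.0/rom"), followed by
printing hexdump(data) and checking has_rom_signature(data). If the
signature is missing, the ROM is probably not mapped the way X expects.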

This is a small program for running a reset on video cards. It will
reset all of your cards. You might want to try running it. If it hangs
it will be easier to debug than an X server.
ftp://ftp.scitechsoft.com/devel/obsolete/x86emu/x86emu-0.8.tar.gz

I added the X server dev list on the CC.




-- 
Jon Smirl
jonsmirl@gmail.com


* Re: 2.6.10 dies when X uses G550
  2005-02-04  3:43                   ` 2.6.10 dies when X uses G550 Helge Hafting
  2005-02-04  4:47                     ` Jon Smirl
@ 2005-02-04  5:57                     ` Dave Airlie
  2005-02-06 10:10                       ` Dave Airlie
  1 sibling, 1 reply; 16+ messages in thread
From: Dave Airlie @ 2005-02-04  5:57 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Jon Smirl, Andreas Hartmann, linux-kernel

> The logs with secondary radeon used to end like this:
> (II) LoadModule: "int10"
> (II) Reloading /usr/X11R6/lib/modules/linux/libint10.a
> (II) RADEON(0): initializing int10
> (**) RADEON(0): Option "InitPrimary" "on"
> (II) Truncating PCI BIOS Length to 53248
> 
> The log for the secondary G550 ends like this, with or without int10:
> (--) MGA(0): Pseudo-DMA transfer window at 0xF3000000
> (==) MGA(0): BIOS at 0xC0000
> (WW) MGA(0): Video BIOS info block not detected!
> (II) MGA(0): MGABios.RamdacType = 0x0
> (==) MGA(0): Write-combining range (0xf0000000,0x2000000)
> (--) MGA(0): VideoRAM: 2048 kByte
> (II) Loading sub module "ddc"
> (II) LoadModule: "ddc"
> (II) Reloading /usr/X11R6/lib/modules/libddc.a
> (II) Loading sub module "i2c"
> (II) LoadModule: "i2c"
> (II) Loading /usr/X11R6/lib/modules/libi2c.a
> (II) Module i2c: vendor="The XFree86 Project"
>         compiled for 4.3.0.1, module version = 1.2.0
>         ABI class: XFree86 Video Driver, version 0.6
> (==) MGA(0): Write-combining range (0xf0000000,0x200000)
> (II) MGA(0): vgaHWGetIOBase: hwp->IOBase is 0x03d0, hwp->PIOOffset is 0x0000
> (II) MGA(0): I2C bus "DDC" initialized.
> (II) MGA(0): I2C device "DDC:ddc2" registered at address 0xA0.
> (II) MGA(0): I2C device "DDC:ddc2" removed.
> (II) MGA(0): I2C Monitor info: (nil)
> (II) MGA(0): end of I2C Monitor info
> 
> The video bios is apparently not detected at all, and therefore not run.
> 
> The kernel doesn't actually hang, only X gets stuck.  sysrq+T
> dumped stack traces for all tasks except the xserver.  For X,
> there was only one line identifying the xserver process and the fact
> that it was Running.  No stack dump.  I managed to kill all tasks
> and have init proceeding into init 2.
> 
> So I wonder: is Debian's X 4.3.0.1 buggy, or the kernel?
> The fact remains that the newer kernels lock up while trying to use the
> secondary radeon, while it actually works with 2.6.8.1.

I've had some luck in reproducing this. However, I've had to retask my
test machine to find some hangs in my real-life application (it can run
for 5 or 6 days without crashing :-), so I might get back to looking
at this at some stage, but when is anybody's guess. All I did was take
a Radeon AGP card and a crappy PCI SiS card and run X on 2.6.10, and
it hung.

Dave.


* Re: 2.6.10 dies when X uses G550
  2005-02-04  5:57                     ` Dave Airlie
@ 2005-02-06 10:10                       ` Dave Airlie
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Airlie @ 2005-02-06 10:10 UTC (permalink / raw)
  To: Helge Hafting; +Cc: Jon Smirl, Andreas Hartmann, linux-kernel

https://bugs.freedesktop.org/show_bug.cgi?id=2431

might have something to do with this...

Dave.

On Fri, 4 Feb 2005 16:57:50 +1100, Dave Airlie <airlied@gmail.com> wrote:
> > The logs with secondary radeon used to end like this:
> > (II) LoadModule: "int10"
> > (II) Reloading /usr/X11R6/lib/modules/linux/libint10.a
> > (II) RADEON(0): initializing int10
> > (**) RADEON(0): Option "InitPrimary" "on"
> > (II) Truncating PCI BIOS Length to 53248
> >
> > The logs for secondary G550 ends like this, with or without int10
> > (--) MGA(0): Pseudo-DMA transfer window at 0xF3000000
> > (==) MGA(0): BIOS at 0xC0000
> > (WW) MGA(0): Video BIOS info block not detected!
> > (II) MGA(0): MGABios.RamdacType = 0x0
> > (==) MGA(0): Write-combining range (0xf0000000,0x2000000)
> > (--) MGA(0): VideoRAM: 2048 kByte
> > (II) Loading sub module "ddc"
> > (II) LoadModule: "ddc"
> > (II) Reloading /usr/X11R6/lib/modules/libddc.a
> > (II) Loading sub module "i2c"
> > (II) LoadModule: "i2c"
> > (II) Loading /usr/X11R6/lib/modules/libi2c.a
> > (II) Module i2c: vendor="The XFree86 Project"
> >         compiled for 4.3.0.1, module version = 1.2.0
> >         ABI class: XFree86 Video Driver, version 0.6
> > (==) MGA(0): Write-combining range (0xf0000000,0x200000)
> > (II) MGA(0): vgaHWGetIOBase: hwp->IOBase is 0x03d0, hwp->PIOOffset is 0x0000
> > (II) MGA(0): I2C bus "DDC" initialized.
> > (II) MGA(0): I2C device "DDC:ddc2" registered at address 0xA0.
> > (II) MGA(0): I2C device "DDC:ddc2" removed.
> > (II) MGA(0): I2C Monitor info: (nil)
> > (II) MGA(0): end of I2C Monitor info
> >
> > The video bios is apparently not detected at all, and therefore not run.
> >
> > The kernel doesn't actually hang, only X gets stuck.  sysrq+T
> > dumped stack traces for all tasks except the xserver.  For X,
> > there was only one line identifying the xserver process and the fact
> > that it was Running.  No stack dump.  I managed to kill all tasks
> > and have init proceed into init 2.
> >
> > So I wonder - is Debian's X 4.3.0.1 buggy, or the kernel?
> > The fact remains that the newer kernels lock up while trying to use the
> > secondary radeon, while it actually works with 2.6.8.1.
> 
> I've had some luck in reproducing this. However, I've had to retask my
> test machine to find some hangs in my real-life application (it can run
> for 5 or 6 days without crashing :-), so I might get back to looking
> at this at some stage, but when is anybody's guess. All I did was take
> a Radeon AGP card and a crappy PCI SiS card and run X on 2.6.10, and
> it hung....
> 
> Dave.
>


* Re: 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result
@ 2005-01-30 19:00 Zoltan Boszormenyi
  0 siblings, 0 replies; 16+ messages in thread
From: Zoltan Boszormenyi @ 2005-01-30 19:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: Helge Hafting, Jon Smirl

> On Sun, Jan 30, 2005 at 10:05:16AM -0500, Jon Smirl wrote:
>> I just checked out on current Linus BK with my AGP Radeon 9000 which
>> is pretty close to a 9200. Everything is working fine.
>> 
>> I notice from his logs that he is running a PCI radeon, not an AGP
>> one. Didn't someone make some changes to the PCI radeon memory
>> management code recently? I run a PCI R128 and that is still working.
>> DRM debug output might give more clues.
>> 
> Yes, it is a PCI radeon.  And the machine has an AGP slot
> too, which is used by a Matrox G550.  This AGP card was not
> used in the test (other than being the VGA console).
> Note that there is no crash if I don't compile 
> AGP support, so the crash is related to AGP somehow even though
> AGP is not supposed to be used in this case.
> 
> As I start X (on the radeon) I notice that the VGA console 
> I'm using (on the G550 AGP) goes black.  I see no need for that either,
> the radeon display is a _different_ device so why black out 
> the vgacon?  Could the problem lurk there somehow?
> 
> Helge Hafting

I suspect it's the X server that makes your G550 go black.

XOrg-X11-6.8.2 RC1 or RC2 fixes that by introducing a VGAAccess
option for its radeon driver. I recompiled xorg-x11-6.8.1 with this
fix on my FC3 system. That got rid of the only thing that annoyed me
about using the linuxconsole.sf.net ruby patch.

I have a Radeon 7000VE PCI and a Radeon 9200SE AGP8x.
Every time I logged out of the first X (localhost:0),
the other one (localhost:1) went blank.

With the above mentioned fix (that I collected from the XOrg devel
mailing list and was made by Ben Herrenschmidt) applied to XOrg
and using

Option "VGAAccess" "on"

on the card that is set up for VGA by the BIOS and

Option "VGAAccess" "off"

on the other, this problem went away. This modification disables
use of the "vgahw" module in XOrg and is unfortunately only applicable
to the radeon driver. BTW, this patch was made specifically
for systems that don't use vgacon, like PPC machines that don't even
have legacy VGA, and for others that use radeonfb.
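
As a rough sketch only, the two options above would sit in the Device
sections of xorg.conf roughly as follows. The Identifier names and
BusID values are made-up placeholders, not taken from any real setup,
and the option is only understood by a radeon driver that carries the
VGAAccess fix:

    Section "Device"
        # Hypothetical entry for the card the BIOS set up for VGA
        Identifier "RadeonPrimary"
        Driver     "radeon"
        BusID      "PCI:1:0:0"        # placeholder, use the card's real bus ID
        Option     "VGAAccess" "on"
    EndSection

    Section "Device"
        # Hypothetical entry for the other card; legacy VGA access disabled
        Identifier "RadeonSecondary"
        Driver     "radeon"
        BusID      "PCI:0:10:0"       # placeholder, use the card's real bus ID
        Option     "VGAAccess" "off"
    EndSection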

I guess the VGA routing patch and X cooperation will also
solve "the other VGA(s) in the system go blank when I fire up X"
in a generic way, not only for Radeons.

Best regards,
Zoltán Böszörményi



Thread overview: 16+ messages
     [not found] <fa.ks44mbo.ljgao4@ifi.uio.no>
     [not found] ` <fa.hinb9iv.s38127@ifi.uio.no>
2005-01-22  9:40   ` 2.6.10 dies when X uses PCI radeon 9200 SE, binary search result Andreas Hartmann
2005-01-22 13:01     ` Dave Airlie
2005-01-26 10:05       ` 2.6.10 dies when X uses PCI radeon 9200 SE, further " Helge Hafting
2005-01-30 11:16         ` Helge Hafting
2005-01-30 11:22           ` Dave Airlie
2005-01-30 14:43             ` Jon Smirl
2005-01-30 15:05             ` Jon Smirl
2005-01-30 16:32               ` Helge Hafting
2005-01-30 17:05                 ` Jon Smirl
2005-01-30 21:56                   ` Helge Hafting
2005-02-04  3:43                   ` 2.6.10 dies when X uses G550 Helge Hafting
2005-02-04  4:47                     ` Jon Smirl
2005-02-04  5:57                     ` Dave Airlie
2005-02-06 10:10                       ` Dave Airlie
2005-01-30 15:07             ` 2.6.10 dies when X uses PCI radeon 9200 SE, further binary search result Jon Smirl
2005-01-30 19:00 Zoltan Boszormenyi
