dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* Major 2.6.38 regression ignored?
@ 2011-05-20 17:06 Luke-Jr
  2011-05-20 18:08 ` Ray Lee
  0 siblings, 1 reply; 36+ messages in thread
From: Luke-Jr @ 2011-05-20 17:06 UTC (permalink / raw)
  To: intel-gfx; +Cc: dri-devel, LKML

I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago 
against 2.6.38. Now 2.6.39 was just released without the regression being 
addressed. This bug makes the system unusable... Some guys on IRC suggested I 
email, so here it is.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-20 17:06 Major 2.6.38 regression ignored? Luke-Jr
@ 2011-05-20 18:08 ` Ray Lee
  2011-05-20 20:24   ` Rafael J. Wysocki
  2011-05-21  8:41   ` Chris Wilson
  0 siblings, 2 replies; 36+ messages in thread
From: Ray Lee @ 2011-05-20 18:08 UTC (permalink / raw)
  To: Luke-Jr, chris, Rafael J. Wysocki; +Cc: intel-gfx, LKML, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 477 bytes --]

[ Adding Chris Wilson (author of the problematic patch) and Rafael Wysocki
to the message ]

On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:

> I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
> against 2.6.38. Now 2.6.39 was just released without the regression being
> addressed. This bug makes the system unusable... Some guys on IRC suggested
> I
> email, so here it is.
>

See the bugzilla entry for the bisection history.

~r.

[-- Attachment #1.2: Type: text/html, Size: 842 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-20 18:08 ` Ray Lee
@ 2011-05-20 20:24   ` Rafael J. Wysocki
  2011-05-20 21:11     ` Ray Lee
  2011-05-21  8:41   ` Chris Wilson
  1 sibling, 1 reply; 36+ messages in thread
From: Rafael J. Wysocki @ 2011-05-20 20:24 UTC (permalink / raw)
  To: Ray Lee; +Cc: Luke-Jr, chris, intel-gfx, dri-devel, LKML

On Friday, May 20, 2011, Ray Lee wrote:
> [ Adding Chris Wilson (author of the problematic patch) and Rafael Wysocki
> to the message ]

It is on the list of known regressions from 2.6.37, but we're not tracking
them any more now that 2.6.39 is out.

Thanks,
Rafael


> On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> 
> > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
> > against 2.6.38. Now 2.6.39 was just released without the regression being
> > addressed. This bug makes the system unusable... Some guys on IRC suggested
> > I
> > email, so here it is.
> >
> 
> See the bugzilla entry for the bisection history.
> 
> ~r.
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-20 20:24   ` Rafael J. Wysocki
@ 2011-05-20 21:11     ` Ray Lee
  0 siblings, 0 replies; 36+ messages in thread
From: Ray Lee @ 2011-05-20 21:11 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Luke-Jr, chris, intel-gfx, dri-devel, LKML

2011/5/20 Rafael J. Wysocki <rjw@sisk.pl>
> It is on the list of known regressions from 2.6.37, but we're not tracking
> them any more now that 2.6.39 is out.

Hopefully Chris is still tracking them, even if you aren't.

Chris? What other information can the affected person provide, or what
tests can he run to help close this out?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-20 18:08 ` Ray Lee
  2011-05-20 20:24   ` Rafael J. Wysocki
@ 2011-05-21  8:41   ` Chris Wilson
  2011-05-21 15:23     ` Luke-Jr
  1 sibling, 1 reply; 36+ messages in thread
From: Chris Wilson @ 2011-05-21  8:41 UTC (permalink / raw)
  To: Ray Lee, Luke-Jr, Rafael J. Wysocki; +Cc: intel-gfx, dri-devel, LKML

On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> [ Adding Chris Wilson (author of the problematic patch) and Rafael Wysocki
> to the message ]
> 
> On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> 
> > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month ago
> > against 2.6.38. Now 2.6.39 was just released without the regression being
> > addressed. This bug makes the system unusable... Some guys on IRC suggested
> > I
> > email, so here it is.
> >
> 
> See the bugzilla entry for the bisection history.

Which has nothing to do with Luke's bug. Considering the thousand things
that can go wrong during X starting, without a hint as to which it is nigh
on impossible to debug except by trial and error. If you set up
netconsole, does the kernel emit an OOPS with it's last dying breath?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-21  8:41   ` Chris Wilson
@ 2011-05-21 15:23     ` Luke-Jr
  2011-05-21 15:40       ` Chris Wilson
  0 siblings, 1 reply; 36+ messages in thread
From: Luke-Jr @ 2011-05-21 15:23 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Ray Lee, Rafael J. Wysocki, intel-gfx, dri-devel, LKML

On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > Wysocki to the message ]
> > 
> > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > regression being addressed. This bug makes the system unusable... Some
> > > guys on IRC suggested I
> > > email, so here it is.
> > 
> > See the bugzilla entry for the bisection history.
> 
> Which has nothing to do with Luke's bug. Considering the thousand things
> that can go wrong during X starting, without a hint as to which it is nigh
> on impossible to debug except by trial and error. If you set up
> netconsole, does the kernel emit an OOPS with it's last dying breath?

Why assume it's a different bug? I would almost wonder if it might affect 
all Sandy Bridge GPUs. In any case, I no longer have the original 
motherboard (it was recalled, as I said in the first post), nor even the 
revision of it (it had other issues that weren't being fixed). I *assume* I 
will have the same problem with my new motherboard (Intel DQ67SW), but I 
haven't verified that yet. I'll be sure to try a netconsole when I have to 
reboot next and get a chance to try the most recent 2.6.38 and .39 kernels, 
but at the moment it seems reasonable to address the problem bisected in the 
bug, even if it turns out to be different.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-21 15:23     ` Luke-Jr
@ 2011-05-21 15:40       ` Chris Wilson
  2011-05-21 19:33         ` Luke-Jr
  2011-05-28 13:19         ` Major 2.6.38 / 2.6.39 " Kirill Smelkov
  0 siblings, 2 replies; 36+ messages in thread
From: Chris Wilson @ 2011-05-21 15:40 UTC (permalink / raw)
  To: Luke-Jr; +Cc: Ray Lee, Rafael J. Wysocki, intel-gfx, LKML, dri-devel

On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <luke@dashjr.org> wrote:
> On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > > Wysocki to the message ]
> > > 
> > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > > regression being addressed. This bug makes the system unusable... Some
> > > > guys on IRC suggested I
> > > > email, so here it is.
> > > 
> > > See the bugzilla entry for the bisection history.
> > 
> > Which has nothing to do with Luke's bug. Considering the thousand things
> > that can go wrong during X starting, without a hint as to which it is nigh
> > on impossible to debug except by trial and error. If you set up
> > netconsole, does the kernel emit an OOPS with it's last dying breath?
> 
> Why assume it's a different bug? I would almost wonder if it might affect 
> all Sandy Bridge GPUs. In any case, I no longer have the original 
> motherboard (it was recalled, as I said in the first post), nor even the 
> revision of it (it had other issues that weren't being fixed). I *assume* I 
> will have the same problem with my new motherboard (Intel DQ67SW), but I 
> haven't verified that yet. I'll be sure to try a netconsole when I have to 
> reboot next and get a chance to try the most recent 2.6.38 and .39 kernels, 
> but at the moment it seems reasonable to address the problem bisected in the 
> bug, even if it turns out to be different.

The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
locking between release and IRQ and so is prone to such races as befell
Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
I can quite confidently state they are separate bugs.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 regression ignored?
  2011-05-21 15:40       ` Chris Wilson
@ 2011-05-21 19:33         ` Luke-Jr
  2011-05-28 13:19         ` Major 2.6.38 / 2.6.39 " Kirill Smelkov
  1 sibling, 0 replies; 36+ messages in thread
From: Luke-Jr @ 2011-05-21 19:33 UTC (permalink / raw)
  To: Chris Wilson
  Cc: alsa-devel, Ray Lee, intel-gfx, LKML, dri-devel, perex,
	Rafael J. Wysocki

On Saturday, May 21, 2011 11:40:17 AM Chris Wilson wrote:
> The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> locking between release and IRQ and so is prone to such races as befell
> Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> I can quite confidently state they are separate bugs.

Unfortunately, I cannot help troubleshoot that bug any further, as I no longer 
have the affected motherboard. I was unable to reproduce it on my Intel 
DQ67SW.

However, I did encounter a new regression, which I have reported as:
	https://bugzilla.kernel.org/show_bug.cgi?id=35552
This one is related to Intel HD Audio, not Graphics.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 regression ignored?
  2011-05-21 15:40       ` Chris Wilson
  2011-05-21 19:33         ` Luke-Jr
@ 2011-05-28 13:19         ` Kirill Smelkov
  2011-07-12 17:17           ` [Intel-gfx] " Kirill Smelkov
  1 sibling, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-05-28 13:19 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Luke-Jr, Ray Lee, Rafael J. Wysocki, intel-gfx, LKML, dri-devel

Hello Chris, everyone,

On Sat, May 21, 2011 at 04:40:17PM +0100, Chris Wilson wrote:
> On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <luke@dashjr.org> wrote:
> > On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> > > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> > > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > > > Wysocki to the message ]
> > > > 
> > > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> > > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > > > regression being addressed. This bug makes the system unusable... Some
> > > > > guys on IRC suggested I
> > > > > email, so here it is.
> > > > 
> > > > See the bugzilla entry for the bisection history.
> > > 
> > > Which has nothing to do with Luke's bug. Considering the thousand things
> > > that can go wrong during X starting, without a hint as to which it is nigh
> > > on impossible to debug except by trial and error. If you set up
> > > netconsole, does the kernel emit an OOPS with it's last dying breath?
> > 
> > Why assume it's a different bug? I would almost wonder if it might affect 
> > all Sandy Bridge GPUs. In any case, I no longer have the original 
> > motherboard (it was recalled, as I said in the first post), nor even the 
> > revision of it (it had other issues that weren't being fixed). I *assume* I 
> > will have the same problem with my new motherboard (Intel DQ67SW), but I 
> > haven't verified that yet. I'll be sure to try a netconsole when I have to 
> > reboot next and get a chance to try the most recent 2.6.38 and .39 kernels, 
> > but at the moment it seems reasonable to address the problem bisected in the 
> > bug, even if it turns out to be different.
> 
> The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> locking between release and IRQ and so is prone to such races as befell
> Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> I can quite confidently state they are separate bugs.
> -Chris

I see DRI1 is maybe buggy and old, but still, pre-kms X used to work ok
on kernels < 2.6.38, and starting from 2.6.38 the system is just
unusable because X either crashes the kernel (2.6.38), or does not start
at all (2.6.39):

https://bugzilla.kernel.org/show_bug.cgi?id=36052


It's a regression. It's blocking me to upgrade to newer kernels. I've
done my homework -- digged it and came with detailed OOPS on netconsole
and bisected to single commit. Could this please be fixed?


Thanks,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 regression ignored?
  2011-05-28 13:19         ` Major 2.6.38 / 2.6.39 " Kirill Smelkov
@ 2011-07-12 17:17           ` Kirill Smelkov
  2011-07-12 18:07             ` Pekka Enberg
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-12 17:17 UTC (permalink / raw)
  To: Chris Wilson
  Cc: Luke-Jr, intel-gfx, LKML, dri-devel, Rafael J. Wysocki, Ray Lee

On Sat, May 28, 2011 at 05:19:20PM +0400, Kirill Smelkov wrote:
> Hello Chris, everyone,
> 
> On Sat, May 21, 2011 at 04:40:17PM +0100, Chris Wilson wrote:
> > On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <luke@dashjr.org> wrote:
> > > On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> > > > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> > > > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> > > > > Wysocki to the message ]
> > > > > 
> > > > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> > > > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> > > > > > ago against 2.6.38. Now 2.6.39 was just released without the
> > > > > > regression being addressed. This bug makes the system unusable... Some
> > > > > > guys on IRC suggested I
> > > > > > email, so here it is.
> > > > > 
> > > > > See the bugzilla entry for the bisection history.
> > > > 
> > > > Which has nothing to do with Luke's bug. Considering the thousand things
> > > > that can go wrong during X starting, without a hint as to which it is nigh
> > > > on impossible to debug except by trial and error. If you set up
> > > > netconsole, does the kernel emit an OOPS with it's last dying breath?
> > > 
> > > Why assume it's a different bug? I would almost wonder if it might affect 
> > > all Sandy Bridge GPUs. In any case, I no longer have the original 
> > > motherboard (it was recalled, as I said in the first post), nor even the 
> > > revision of it (it had other issues that weren't being fixed). I *assume* I 
> > > will have the same problem with my new motherboard (Intel DQ67SW), but I 
> > > haven't verified that yet. I'll be sure to try a netconsole when I have to 
> > > reboot next and get a chance to try the most recent 2.6.38 and .39 kernels, 
> > > but at the moment it seems reasonable to address the problem bisected in the 
> > > bug, even if it turns out to be different.
> > 
> > The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> > locking between release and IRQ and so is prone to such races as befell
> > Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> > I can quite confidently state they are separate bugs.
> > -Chris
> 
> I see DRI1 is maybe buggy and old, but still, pre-kms X used to work ok
> on kernels < 2.6.38, and starting from 2.6.38 the system is just
> unusable because X either crashes the kernel (2.6.38), or does not start
> at all (2.6.39):
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=36052
> 
> 
> It's a regression. It's blocking me to upgrade to newer kernels. I've
> done my homework -- digged it and came with detailed OOPS on netconsole
> and bisected to single commit. Could this please be fixed?

Silence...

Still, reverting the bisected patch helps even for 3.0:

https://bugzilla.kernel.org/show_bug.cgi?id=36052#c4

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 regression ignored?
  2011-07-12 17:17           ` [Intel-gfx] " Kirill Smelkov
@ 2011-07-12 18:07             ` Pekka Enberg
       [not found]               ` <CA+55aFxakA2U+oMJ1T7awTYa+p6xp9N0aCbfrUqgkF7BJ8gnQw@mail.gmail.com>
  0 siblings, 1 reply; 36+ messages in thread
From: Pekka Enberg @ 2011-07-12 18:07 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Chris Wilson, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Herbert Xu, Linus Torvalds,
	Andrew Morton

On Tue, Jul 12, 2011 at 8:17 PM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> On Sat, May 28, 2011 at 05:19:20PM +0400, Kirill Smelkov wrote:
>> Hello Chris, everyone,
>>
>> On Sat, May 21, 2011 at 04:40:17PM +0100, Chris Wilson wrote:
>> > On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <luke@dashjr.org> wrote:
>> > > On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
>> > > > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
>> > > > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
>> > > > > Wysocki to the message ]
>> > > > >
>> > > > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
>> > > > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
>> > > > > > ago against 2.6.38. Now 2.6.39 was just released without the
>> > > > > > regression being addressed. This bug makes the system unusable... Some
>> > > > > > guys on IRC suggested I
>> > > > > > email, so here it is.
>> > > > >
>> > > > > See the bugzilla entry for the bisection history.
>> > > >
>> > > > Which has nothing to do with Luke's bug. Considering the thousand things
>> > > > that can go wrong during X starting, without a hint as to which it is nigh
>> > > > on impossible to debug except by trial and error. If you set up
>> > > > netconsole, does the kernel emit an OOPS with it's last dying breath?
>> > >
>> > > Why assume it's a different bug? I would almost wonder if it might affect
>> > > all Sandy Bridge GPUs. In any case, I no longer have the original
>> > > motherboard (it was recalled, as I said in the first post), nor even the
>> > > revision of it (it had other issues that weren't being fixed). I *assume* I
>> > > will have the same problem with my new motherboard (Intel DQ67SW), but I
>> > > haven't verified that yet. I'll be sure to try a netconsole when I have to
>> > > reboot next and get a chance to try the most recent 2.6.38 and .39 kernels,
>> > > but at the moment it seems reasonable to address the problem bisected in the
>> > > bug, even if it turns out to be different.
>> >
>> > The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
>> > locking between release and IRQ and so is prone to such races as befell
>> > Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
>> > I can quite confidently state they are separate bugs.
>> > -Chris
>>
>> I see DRI1 is maybe buggy and old, but still, pre-kms X used to work ok
>> on kernels < 2.6.38, and starting from 2.6.38 the system is just
>> unusable because X either crashes the kernel (2.6.38), or does not start
>> at all (2.6.39):
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=36052
>>
>>
>> It's a regression. It's blocking me to upgrade to newer kernels. I've
>> done my homework -- digged it and came with detailed OOPS on netconsole
>> and bisected to single commit. Could this please be fixed?
>
> Silence...
>
> Still, reverting the bisected patch helps even for 3.0:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=36052#c4

Keith, Chris, what's up with this regression from 2.6.38? It seems
commit e8616b6 ("drm/i915: Initialise ring vfuncs for old DRI paths")
caused problems on other machines.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
       [not found]               ` <CA+55aFxakA2U+oMJ1T7awTYa+p6xp9N0aCbfrUqgkF7BJ8gnQw@mail.gmail.com>
@ 2011-07-22 11:08                 ` Kirill Smelkov
  2011-07-22 14:12                   ` Herbert Xu
  2011-07-22 18:00                   ` Keith Packard
  0 siblings, 2 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-22 11:08 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Chris Wilson, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Herbert Xu, Linus Torvalds,
	Andrew Morton, Florian Mickler, Keith Packard

 [ Cc'ing Florian Mickler and Keith Packard ]

On Tue, Jul 12, 2011 at 09:07:47PM +0300, Pekka Enberg wrote:
> On Tue, Jul 12, 2011 at 8:17 PM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> > On Sat, May 28, 2011 at 05:19:20PM +0400, Kirill Smelkov wrote:
> >> Hello Chris, everyone,
> >>
> >> On Sat, May 21, 2011 at 04:40:17PM +0100, Chris Wilson wrote:
> >> > On Sat, 21 May 2011 11:23:53 -0400, "Luke-Jr" <luke@dashjr.org> wrote:
> >> > > On Saturday, May 21, 2011 4:41:45 AM Chris Wilson wrote:
> >> > > > On Fri, 20 May 2011 11:08:56 -0700, Ray Lee <ray-lk@madrabbit.org> wrote:
> >> > > > > [ Adding Chris Wilson (author of the problematic patch) and Rafael
> >> > > > > Wysocki to the message ]
> >> > > > >
> >> > > > > On Fri, May 20, 2011 at 10:06 AM, Luke-Jr <luke@dashjr.org> wrote:
> >> > > > > > I submitted https://bugzilla.kernel.org/show_bug.cgi?id=33662 a month
> >> > > > > > ago against 2.6.38. Now 2.6.39 was just released without the
> >> > > > > > regression being addressed. This bug makes the system unusable... Some
> >> > > > > > guys on IRC suggested I
> >> > > > > > email, so here it is.
> >> > > > >
> >> > > > > See the bugzilla entry for the bisection history.
> >> > > >
> >> > > > Which has nothing to do with Luke's bug. Considering the thousand things
> >> > > > that can go wrong during X starting, without a hint as to which it is nigh
> >> > > > on impossible to debug except by trial and error. If you set up
> >> > > > netconsole, does the kernel emit an OOPS with it's last dying breath?
> >> > >
> >> > > Why assume it's a different bug? I would almost wonder if it might affect
> >> > > all Sandy Bridge GPUs. In any case, I no longer have the original
> >> > > motherboard (it was recalled, as I said in the first post), nor even the
> >> > > revision of it (it had other issues that weren't being fixed). I *assume* I
> >> > > will have the same problem with my new motherboard (Intel DQ67SW), but I
> >> > > haven't verified that yet. I'll be sure to try a netconsole when I have to
> >> > > reboot next and get a chance to try the most recent 2.6.38 and .39 kernels,
> >> > > but at the moment it seems reasonable to address the problem bisected in the
> >> > > bug, even if it turns out to be different.
> >> >
> >> > The bisection is into an old DRI1 bug on 945GM. That DRI has inadequate
> >> > locking between release and IRQ and so is prone to such races as befell
> >> > Kirill should not surprise anyone. As neither UMS nor DRI supported SNB,
> >> > I can quite confidently state they are separate bugs.
> >> > -Chris
> >>
> >> I see DRI1 is maybe buggy and old, but still, pre-kms X used to work ok
> >> on kernels < 2.6.38, and starting from 2.6.38 the system is just
> >> unusable because X either crashes the kernel (2.6.38), or does not start
> >> at all (2.6.39):
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=36052
> >>
> >>
> >> It's a regression. It's blocking me to upgrade to newer kernels. I've
> >> done my homework -- digged it and came with detailed OOPS on netconsole
> >> and bisected to single commit. Could this please be fixed?
> >
> > Silence...
> >
> > Still, reverting the bisected patch helps even for 3.0:
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=36052#c4
> 
> Keith, Chris, what's up with this regression from 2.6.38? It seems
> commit e8616b6 ("drm/i915: Initialise ring vfuncs for old DRI paths")
> caused problems on other machines.

Silence again, and not surprising -- I was ringing this bell for 3
months already:

https://bugzilla.kernel.org/show_bug.cgi?id=33662#c10
https://bugzilla.kernel.org/show_bug.cgi?id=36052
(and on the list)

with detailed logs and bisected single patch, without even single reply
from intel-gfx people.


And now after v3.0 is out, I've tested it again, and yes, like it was
broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
bad io access the system freezes completely:

    On netconsole:

    # X starts here, then

    [   45.102377] ------------[ cut here ]------------
    [   45.102402] WARNING: at lib/iomap.c:43 bad_io_access+0x3d/0x40()
    [   45.102411] Hardware name: PCISA-945GSE
    [   45.102418] Bad IO access at port 0x84 (return inl(port))
    [   45.102425] Modules linked in: 
    [   45.102438] Pid: 2846, comm: sshd Not tainted 3.0.0--NAVY #33
    [   45.102445] Call Trace:
    [   45.102460]  [<c118e9fd>] ? bad_io_access+0x3d/0x40
    [   45.102473]  [<c10287ec>] warn_slowpath_common+0x6c/0xa0
    [   45.102484]  [<c118e9fd>] ? bad_io_access+0x3d/0x40
    [   45.102495]  [<c102889e>] warn_slowpath_fmt+0x2e/0x30
    [   45.102506]  [<c118e9fd>] bad_io_access+0x3d/0x40
    [   45.102516]  [<c118edb2>] ioread32+0x22/0x40
    [   45.102528]  [<c122cc7d>] i915_driver_irq_handler+0x1ad/0x660
    [   45.102541]  [<c12c6a7e>] ? rtl8169_interrupt+0xee/0x370
    [   45.102554]  [<c105c396>] handle_irq_event_percpu+0x36/0x140
    [   45.102565]  [<c105e490>] ? handle_edge_irq+0x150/0x150
    [   45.102576]  [<c105c4d9>] handle_irq_event+0x39/0x60
    [   45.102587]  [<c105e4d5>] handle_fasteoi_irq+0x45/0xd0
    [   45.102594]  <IRQ>   [<c1003c29>] ? do_IRQ+0x39/0xb0
    [   45.102613]  [<c103c9b3>] ? start_flush_work+0xc3/0x130
    [   45.102625]  [<c13bc329>] ? common_interrupt+0x29/0x30
    [   45.102636]  [<c13bc329>] ? common_interrupt+0x29/0x30
    [   45.102648]  [<c11e007b>] ? pnpacpi_encode_resources+0x37b/0x7a0
    [   45.102659]  [<c109971e>] ? fget_light+0xe/0xf0
    [   45.102671]  [<c10a8f97>] ? do_select+0x2e7/0x680
    [   45.102685]  [<c1341998>] ? sch_direct_xmit+0x58/0x1d0
    [   45.102695]  [<c10a83e0>] ? poll_freewait+0xa0/0xa0
    [   45.102706]  [<c102df37>] ? local_bh_enable+0x47/0xa0
    [   45.102718]  [<c132e371>] ? dev_queue_xmit+0x101/0x4e0
    [   45.102729]  [<c134ffba>] ? ip_finish_output+0x10a/0x2f0
    [   45.102740]  [<c1350216>] ? ip_output+0x76/0x90
    [   45.102750]  [<c134d715>] ? ip_local_out+0x65/0x70
    [   45.102762]  [<c134fa3d>] ? ip_queue_xmit+0x1bd/0x3b0
    [   45.102775]  [<c1362af8>] ? tcp_transmit_skb+0x468/0x7d0
    [   45.102788]  [<c13215af>] ? sk_reset_timer+0xf/0x20
    [   45.102798]  [<c1362446>] ? tcp_event_new_data_sent+0x86/0xc0
    [   45.102809]  [<c1364fc1>] ? tcp_write_xmit+0x1e1/0x9a0
    [   45.102822]  [<c1326925>] ? __alloc_skb+0x55/0x100
    [   45.102838]  [<c102df37>] ? local_bh_enable+0x47/0xa0
    [   45.102849]  [<c1321246>] ? release_sock+0xd6/0x110
    [   45.102859]  [<c13657f7>] ? __tcp_push_pending_frames+0x27/0x80
    [   45.102870]  [<c13584fa>] ? tcp_sendmsg+0x64a/0xac0

    -*- and then system FREEZE -*-


For completeness `X -verbose` log is in "Appendix 1", (but who cares
anyway? I've sent lots of such logs without a reply).


And again, after reverting e8616b6 ("drm/i915: Initialise ring vfuncs
for old DRI paths") on top of v3.0, X works without any problem again.


So I wonder:

    I thought people are trying to do "no regressions" rule in kernel.
    Should we then just apply the following patch? In case Intel people
    are not responding, should it just go directly into mainline?

    Or would it be more fair to say that UMS is not supported anymore,
    is broken and just remove support for it?


Thanks,
Kirill


P.S. Sometimes people change their hardware preferences based on
software support quality. Knock, knock...


From ef91a178e6069ae07c7a3c1e39e13eea609953cd Mon Sep 17 00:00:00 2001
From: Kirill Smelkov <kirr@mns.spb.ru>
Date: Wed, 29 Jun 2011 14:22:49 +0400
Subject: [PATCH] Revert "drm/i915: Initialise ring vfuncs for old DRI paths"

This reverts commit e8616b6ced6137085e6657cc63bc2fe3900b8616.

See https://bugzilla.kernel.org/show_bug.cgi?id=36052

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Florian Mickler <florian@mickler.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Keith Packard <keithp@keithp.com>
Cc: stable@kernel.org

---
 drivers/gpu/drm/i915/i915_dma.c         |   25 +++++++++++++-----
 drivers/gpu/drm/i915/intel_ringbuffer.c |   42 -------------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |    3 --
 3 files changed, 18 insertions(+), 52 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 296fbd6..9300d18 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -160,7 +160,7 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
-	int ret;
+	struct intel_ring_buffer *ring = LP_RING(dev_priv);
 
 	master_priv->sarea = drm_getsarea(dev);
 	if (master_priv->sarea) {
@@ -171,22 +171,33 @@ static int i915_initialize(struct drm_device * dev, drm_i915_init_t * init)
 	}
 
 	if (init->ring_size != 0) {
-		if (LP_RING(dev_priv)->obj != NULL) {
+		if (ring->obj != NULL) {
 			i915_dma_cleanup(dev);
 			DRM_ERROR("Client tried to initialize ringbuffer in "
 				  "GEM mode\n");
 			return -EINVAL;
 		}
 
-		ret = intel_render_ring_init_dri(dev,
-						 init->ring_start,
-						 init->ring_size);
-		if (ret) {
+		ring->size = init->ring_size;
+
+		ring->map.offset = init->ring_start;
+		ring->map.size = init->ring_size;
+		ring->map.type = 0;
+		ring->map.flags = 0;
+		ring->map.mtrr = 0;
+
+		drm_core_ioremap_wc(&ring->map, dev);
+
+		if (ring->map.handle == NULL) {
 			i915_dma_cleanup(dev);
-			return ret;
+			DRM_ERROR("can not ioremap virtual address for"
+				  " ring buffer\n");
+			return -ENOMEM;
 		}
 	}
 
+	ring->virtual_start = ring->map.handle;
+
 	dev_priv->cpp = init->cpp;
 	dev_priv->back_offset = init->back_offset;
 	dev_priv->front_offset = init->front_offset;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 95c4b14..8d2f610 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1304,48 +1304,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	return intel_init_ring_buffer(dev, ring);
 }
 
-int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
-{
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
-
-	*ring = render_ring;
-	if (INTEL_INFO(dev)->gen >= 6) {
-		ring->add_request = gen6_add_request;
-		ring->irq_get = gen6_render_ring_get_irq;
-		ring->irq_put = gen6_render_ring_put_irq;
-	} else if (IS_GEN5(dev)) {
-		ring->add_request = pc_render_add_request;
-		ring->get_seqno = pc_render_get_seqno;
-	}
-
-	ring->dev = dev;
-	INIT_LIST_HEAD(&ring->active_list);
-	INIT_LIST_HEAD(&ring->request_list);
-	INIT_LIST_HEAD(&ring->gpu_write_list);
-
-	ring->size = size;
-	ring->effective_size = ring->size;
-	if (IS_I830(ring->dev))
-		ring->effective_size -= 128;
-
-	ring->map.offset = start;
-	ring->map.size = size;
-	ring->map.type = 0;
-	ring->map.flags = 0;
-	ring->map.mtrr = 0;
-
-	drm_core_ioremap_wc(&ring->map, dev);
-	if (ring->map.handle == NULL) {
-		DRM_ERROR("can not ioremap virtual address for"
-			  " ring buffer\n");
-		return -ENOMEM;
-	}
-
-	ring->virtual_start = (void __force __iomem *)ring->map.handle;
-	return 0;
-}
-
 int intel_init_bsd_ring_buffer(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 39ac2b6..b6b0fd4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -197,7 +197,4 @@ static inline void i915_trace_irq_get(struct intel_ring_buffer *ring, u32 seqno)
 		ring->trace_irq_seqno = seqno;
 }
 
-/* DRI warts */
-int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size);
-
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.6.233.gd79bc






Appendix 1. `X -verbose` log
----------------------------

# same, starting X over ssh
navy3:~# X -verbose
_XSERVTransSocketOpenCOTSServer: Unable to open socket for inet6
_XSERVTransOpen: transport open failed for inet6/navy3:0
_XSERVTransMakeAllCOTSServerListeners: failed to open listener for inet6

X.Org X Server 1.4.2
Release Date: 11 June 2008
X Protocol Version 11, Revision 0
Build Operating System: Linux Debian (xorg-server 2:1.4.2-10.lenny3)
Current Operating System: Linux navy3 3.0.0--NAVY #33 PREEMPT Fri Jul 22 13:56:40 MSD 2011 i68
6
Build Date: 25 September 2010  12:05:44PM
 
        Before reporting problems, check http://wiki.x.org
        to make sure that you have the latest version.
Module Loader present
Markers: (--) probed, (**) from config file, (==) default setting,
        (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat Jan 19 21:24:12 2002
(==) Using config file: "/etc/X11/xorg.conf"
(==) ServerLayout "Default Layout"
(**) |-->Screen "Default Screen" (0)
(**) |   |-->Monitor "LCD 1600x1200"
(==) No device specified for screen "Default Screen".
        Using the first device section listed.
(**) |   |-->Device "Default card"
(**) |-->Input Device "Generic Keyboard"
(**) |-->Input Device "Configured Mouse"
(==) Automatically adding devices
(==) Automatically enabling devices
(==) No FontPath specified.  Using compiled-in default.
(WW) The directory "/usr/share/fonts/X11/100dpi/" does not exist.
        Entry deleted from font path.
(WW) The directory "/usr/share/fonts/X11/75dpi/" does not exist.
        Entry deleted from font path.
(WW) The directory "/usr/share/fonts/X11/Type1" does not exist.
        Entry deleted from font path.
(WW) The directory "/usr/share/fonts/X11/100dpi" does not exist.
        Entry deleted from font path.
(WW) The directory "/usr/share/fonts/X11/75dpi" does not exist.
        Entry deleted from font path.
(WW) The directory "/var/lib/defoma/x-ttcidfont-conf.d/dirs/TrueType" does not exist.
        Entry deleted from font path.
(==) FontPath set to:
        /usr/share/fonts/X11/misc,
        /usr/share/fonts/X11/cyrillic
(==) RgbPath set to "/etc/X11/rgb"
(==) ModulePath set to "/usr/lib/xorg/modules"
(II) Loading /usr/lib/xorg/modules//libpcidata.so
(II) Module pcidata: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(--) using VT number 7

(--) PCI:*(0:2:0) Intel Corporation Mobile 945GME Express Integrated Graphics Controller rev 3
, Mem @ 0xfe980000/19, 0xd0000000/28, 0xfe940000/18, I/O @ 0xbc80/3
(--) PCI: (0:2:1) Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphi
cs Controller rev 3, Mem @ 0xfe880000/19
(II) "extmod" will be loaded. This was enabled by default and also specified in the config fil
e.
(II) "dbe" will be loaded. This was enabled by default and also specified in the config file.
(II) "glx" will be loaded by default.
(II) "freetype" will be loaded. This was enabled by default and also specified in the config f
ile.
(II) "record" will be loaded. This was enabled by default and also specified in the config fil
e.
(II) "dri" will be loaded by default.
(II) Loading /usr/lib/xorg/modules/extensions//libdbe.so
(II) Module dbe: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(II) Module "ddc" already built-in
(II) Loading /usr/lib/xorg/modules/extensions//libextmod.so
(II) Module extmod: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules//fonts/libfreetype.so
(II) Module freetype: vendor="X.Org Foundation & the After X-TT Project"
        compiled for 1.4.2, module version = 2.1.0
(II) Loading /usr/lib/xorg/modules//libint10.so
(II) Module int10: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules/extensions//librecord.so
(II) Module record: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.13.0
(II) Loading /usr/lib/xorg/modules//libvbe.so
(II) Module vbe: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.1.0
(II) Loading /usr/lib/xorg/modules/drivers//v4l_drv.so
(II) Module v4l: vendor="X.Org Foundation"
        compiled for 1.4.0.90, module version = 0.1.1
(II) Loading /usr/lib/xorg/modules/extensions//libglx.so
(II) Module glx: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(==) AIGLX enabled
(II) Loading /usr/lib/xorg/modules/extensions//libdri.so
(II) Module dri: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(II) Matched intel from file name intel.ids in autoconfig
(==) Matched intel for the autoconfigured driver
(==) Assigned the driver to the xf86ConfigLayout
(II) Loading /usr/lib/xorg/modules/drivers//intel_drv.so
(II) Module intel: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 2.3.2
(II) Loading /usr/lib/xorg/modules/input//kbd_drv.so
(II) Module kbd: vendor="X.Org Foundation"
        compiled for 1.4.0.90, module version = 1.3.1
(II) Loading /usr/lib/xorg/modules/input//mouse_drv.so
(II) Module mouse: vendor="X.Org Foundation"
        compiled for 1.4.0.90, module version = 1.3.0
(II) v4l driver for Video4Linux
(II) intel: Driver for Intel Integrated Graphics Chipsets: i810,
        i810-dc100, i810e, i815, i830M, 845G, 852GM/855GM, 865G, 915G,
        E7221 (i915), 915GM, 945G, 945GM, 945GME, 965G, G35, 965Q, 946GZ,
        965GM, 965GME/GLE, G33, Q35, Q33,
        Mobile Intel® GM45 Express Chipset,
        Intel Integrated Graphics Device, G45/G43, Q45/Q43, G41
(--) Assigning device section with no busID to primary device
(WW) intel: No matching Device section for instance (BusID PCI:0:2:1) found
(--) Chipset 945GME found
(II) Loading /usr/lib/xorg/modules//libvgahw.so
(II) Module vgahw: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 0.1.0
(**) intel(0): Depth 16, (--) framebuffer bpp 16
(==) intel(0): RGB weight 565
(==) intel(0): Default visual is TrueColor
(II) intel(0): Integrated Graphics Chipset: Intel(R) 945GME
(--) intel(0): Chipset: "945GME"
(--) intel(0): Linear framebuffer at 0xD0000000
(--) intel(0): IO registers at addr 0xFE980000
(II) intel(0): 2 display pipes available.
(**) intel(0): Using XAA for acceleration
(II) Module "ddc" already built-in
(II) Module "i2c" already built-in
(II) intel(0): Output VGA using monitor section LCD 1600x1200
(II) intel(0): I2C bus "CRTDDC_A" initialized.
(II) intel(0): Output LVDS has no monitor section
(II) intel(0): I2C bus "LVDSDDC_C" initialized.
(II) intel(0): Attempting to determine panel fixed mode.
(II) intel(0): I2C device "LVDSDDC_C:ddc2" registered at address 0xA0.
(II) intel(0): I2C device "LVDSDDC_C:ddc2" removed.
(II) intel(0): initializing int10
(WW) intel(0): Bad V_BIOS checksum
(II) intel(0): Primary V_BIOS segment is: 0xc000
(II) intel(0): VESA BIOS detected
(II) intel(0): I2C bus "SDVOCTRL_E for SDVOB" initialized.
(II) intel(0): I2C device "SDVOCTRL_E for SDVOB:SDVO Controller B" registered at address 0x70.
(II) intel(0): No SDVO device found on SDVOB
(II) intel(0): I2C device "SDVOCTRL_E for SDVOB:SDVO Controller B" removed.
(II) intel(0): I2C bus "SDVOCTRL_E for SDVOB" removed.
(II) intel(0): I2C bus "SDVOCTRL_E for SDVOC" initialized.
(II) intel(0): I2C device "SDVOCTRL_E for SDVOC:SDVO Controller C" registered at address 0x72.
(II) intel(0): No SDVO device found on SDVOC
(II) intel(0): I2C device "SDVOCTRL_E for SDVOC:SDVO Controller C" removed.
(II) intel(0): I2C bus "SDVOCTRL_E for SDVOC" removed.
(II) intel(0): Output TV has no monitor section
(II) intel(0): I2C device "CRTDDC_A:ddc2" registered at address 0xA0.
(II) intel(0): EDID vendor "SAM", prod id 476
(II) intel(0): Using hsync ranges from config file
(II) intel(0): Using vrefresh ranges from config file
(II) intel(0): Printing DDC gathered Modelines:
(II) intel(0): Modeline "1280x1024"x0.0  108.00  1280 1328 1440 1688  1024 1025 1028 1066 +hsy
nc +vsync (64.0 kHz)
(II) intel(0): Modeline "800x600"x0.0   40.00  800 840 968 1056  600 601 605 628 +hsync +vsync
 (37.9 kHz)
(II) intel(0): Modeline "800x600"x0.0   36.00  800 824 896 1024  600 601 603 625 +hsync +vsync
 (35.2 kHz)
(II) intel(0): Modeline "640x480"x0.0   31.50  640 656 720 840  480 481 484 500 -hsync -vsync 
(37.5 kHz)
(II) intel(0): Modeline "640x480"x0.0   31.50  640 664 704 832  480 489 491 520 -hsync -vsync 
(37.9 kHz)
(II) intel(0): Modeline "640x480"x0.0   30.24  640 704 768 864  480 483 486 525 -hsync -vsync 
(35.0 kHz)
(II) intel(0): Modeline "640x480"x0.0   25.20  640 656 752 800  480 490 492 525 -hsync -vsync 
(31.5 kHz)
(II) intel(0): Modeline "720x400"x0.0   28.32  720 738 846 900  400 412 414 449 -hsync +vsync 
(31.5 kHz)
(II) intel(0): Modeline "1280x1024"x0.0  135.00  1280 1296 1440 1688  1024 1025 1028 1066 +hsy
nc +vsync (80.0 kHz)
(II) intel(0): Modeline "1024x768"x0.0   78.80  1024 1040 1136 1312  768 769 772 800 +hsync +v
sync (60.1 kHz)
(II) intel(0): Modeline "1024x768"x0.0   75.00  1024 1048 1184 1328  768 771 777 806 -hsync -v
sync (56.5 kHz)
(II) intel(0): Modeline "1024x768"x0.0   65.00  1024 1048 1184 1344  768 771 777 806 -hsync -v
sync (48.4 kHz)
(II) intel(0): Modeline "832x624"x0.0   57.28  832 864 928 1152  624 625 628 667 -hsync -vsync
 (49.7 kHz)
(II) intel(0): Modeline "800x600"x0.0   49.50  800 816 896 1056  600 601 604 625 +hsync +vsync
 (46.9 kHz)
(II) intel(0): Modeline "800x600"x0.0   50.00  800 856 976 1040  600 637 643 666 +hsync +vsync
 (48.1 kHz)
(II) intel(0): Modeline "1152x864"x0.0  108.00  1152 1216 1344 1600  864 865 868 900 +hsync +v
sync (67.5 kHz)
(II) intel(0): Modeline "1280x1024"x59.9  109.00  1280 1368 1496 1712  1024 1027 1034 1063 -hs
ync +vsync (63.7 kHz)
(II) intel(0): Modeline "1280x960"x59.9  101.25  1280 1360 1488 1696  960 963 967 996 -hsync +
vsync (59.7 kHz)
(II) intel(0): Modeline "1152x864"x74.8  104.00  1152 1224 1344 1536  864 867 871 905 -hsync +
vsync (67.7 kHz)
(II) intel(0): EDID vendor "SAM", prod id 476
(II) intel(0): I2C device "LVDSDDC_C:ddc2" registered at address 0xA0.
(II) intel(0): I2C device "LVDSDDC_C:ddc2" removed.
(II) intel(0): Output VGA connected
(II) intel(0): Output LVDS connected
(II) intel(0): Output TV disconnected
(II) intel(0): Output VGA using initial mode 1280x1024
(II) intel(0): Output LVDS using initial mode 800x600
(II) intel(0): Monitoring connected displays enabled
(II) intel(0): detected 256 kB GTT.
(II) intel(0): detected 7932 kB stolen memory.
(==) intel(0): video overlay key set to 0x83e
(==) intel(0): Will not try to enable page flipping
(==) intel(0): Triple buffering disabled
(==) intel(0): Intel XvMC decoder disabled
(==) intel(0): Using gamma correction (1.0, 1.0, 1.0)
(**) intel(0): Display dimensions: (340, 270) mm
(**) intel(0): DPI set to (119, 150)
(II) Loading /usr/lib/xorg/modules//libfb.so
(II) Module fb: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.0.0
(II) Loading /usr/lib/xorg/modules//libxaa.so
(II) Module xaa: vendor="X.Org Foundation"
        compiled for 1.4.2, module version = 1.2.0
(II) Module "ramdac" already built-in
(II) intel(0): Comparing regs from server start up to After PreInit
(WW) intel(0): Register 0x61200 (PP_STATUS) changed from 0xc0000008 to 0xd000000a
(WW) intel(0): PP_STATUS before: on, ready, sequencing idle
(WW) intel(0): PP_STATUS after: on, ready, sequencing on
(WW) intel(0): Register 0x61114 (PORT_HOTPLUG_STAT) changed from 0x00000b00 to 0x00000f00
(WW) intel(0): Register 0x68000 (TV_CTL) changed from 0x10000010 to 0x000c0010
(WW) intel(0): Register 0x68010 (TV_CSC_Y) changed from 0x00000000 to 0x0332012d
(WW) intel(0): Register 0x68014 (TV_CSC_Y2) changed from 0x00000000 to 0x07d30104
(WW) intel(0): Register 0x68018 (TV_CSC_U) changed from 0x00000000 to 0x0733052d
(WW) intel(0): Register 0x6801c (TV_CSC_U2) changed from 0x00000000 to 0x05c70200
(WW) intel(0): Register 0x68020 (TV_CSC_V) changed from 0x00000000 to 0x0340030c
(WW) intel(0): Register 0x68024 (TV_CSC_V2) changed from 0x00000000 to 0x06d00200
(WW) intel(0): Register 0x68028 (TV_CLR_KNOBS) changed from 0x00000000 to 0x00606000
(WW) intel(0): Register 0x6802c (TV_CLR_LEVEL) changed from 0x00000000 to 0x010b00e1
(WW) intel(0): Register 0x68030 (TV_H_CTL_1) changed from 0x00000000 to 0x00400359
(WW) intel(0): Register 0x68034 (TV_H_CTL_2) changed from 0x00000000 to 0x80480022
(WW) intel(0): Register 0x68038 (TV_H_CTL_3) changed from 0x00000000 to 0x007c0344
(WW) intel(0): Register 0x6803c (TV_V_CTL_1) changed from 0x00000000 to 0x00f01415
(WW) intel(0): Register 0x68040 (TV_V_CTL_2) changed from 0x00000000 to 0x00060607
(WW) intel(0): Register 0x68044 (TV_V_CTL_3) changed from 0x00000000 to 0x80120001
(WW) intel(0): Register 0x68048 (TV_V_CTL_4) changed from 0x00000000 to 0x000900f0
(WW) intel(0): Register 0x6804c (TV_V_CTL_5) changed from 0x00000000 to 0x000a00f0
(WW) intel(0): Register 0x68050 (TV_V_CTL_6) changed from 0x00000000 to 0x000900f0
(WW) intel(0): Register 0x68054 (TV_V_CTL_7) changed from 0x00000000 to 0x000a00f0
(WW) intel(0): Register 0x68060 (TV_SC_CTL_1) changed from 0x00000000 to 0xc1710088
(WW) intel(0): Register 0x68064 (TV_SC_CTL_2) changed from 0x00000000 to 0x4e2d1dc8
(WW) intel(0): Register 0x68070 (TV_WIN_POS) changed from 0x00000000 to 0x00360024
(WW) intel(0): Register 0x68074 (TV_WIN_SIZE) changed from 0x00000000 to 0x02640198
(WW) intel(0): Register 0x68080 (TV_FILTER_CTL_1) changed from 0x00000000 to 0x800010bb
(WW) intel(0): Register 0x68084 (TV_FILTER_CTL_2) changed from 0x00000000 to 0x00028283
(WW) intel(0): Register 0x68088 (TV_FILTER_CTL_3) changed from 0x00000000 to 0x00014141
(WW) intel(0): Register 0x68100 (TV_H_LUMA_0) changed from 0x00000000 to 0xb1403000
(WW) intel(0): Register 0x681ec (TV_H_LUMA_59) changed from 0x00000000 to 0x0000b060
(WW) intel(0): Register 0x68200 (TV_H_CHROMA_0) changed from 0x00000000 to 0xb1403000
(WW) intel(0): Register 0x682ec (TV_H_CHROMA_59) changed from 0x00000000 to 0x0000b060
(II) intel(0): Kernel reported 107520 total, 0 used
(II) [drm] DRM interface version 1.4
(II) [drm] DRM open master succeeded.
(II) intel(0): [drm] Using the DRM lock SAREA also for drawables.
(II) intel(0): [drm] framebuffer mapped by ddx driver
(II) intel(0): [drm] added 1 reserved context for kernel
(II) intel(0): X context handle = 0x1
(II) intel(0): [drm] installed DRM signal handler
(**) intel(0): Framebuffer compression enabled
(**) intel(0): Tiling enabled
(==) intel(0): VideoRam: 262144 KB
(II) intel(0): Attempting memory allocation with tiled buffers.
(II) intel(0): Allocating 4800 scanlines for pixmap cache
(II) intel(0): Tiled allocation successful.
(II) intel(0): [drm] Registers = 0xfe980000
(II) intel(0): [drm] ring buffer = 0xd0000000
(II) intel(0): [drm] mapped front buffer at 0xd2000000, handle = 0xd2000000
(II) intel(0): [drm] mapped back buffer at 0xd0800000, handle = 0xd0800000
(II) intel(0): [drm] mapped depth buffer at 0xd1000000, handle = 0xd1000000
(II) intel(0): [drm] mapped classic textures at 0xd4000000, handle = 0xd4000000
(II) intel(0): [drm] Initialized kernel agp heap manager, 33554432
(II) intel(0): [dri] visual configs initialized
(II) intel(0): Page Flipping disabled
(==) intel(0): Write-combining range (0xd0000000,0x10000000)
(II) intel(0): Using XFree86 Acceleration Architecture (XAA)
        Screen to screen bit blits
        Solid filled rectangles
        8x8 mono pattern filled rectangles
        Indirect CPU to Screen color expansion
        Solid Horizontal and Vertical Lines
        Setting up tile and stipple cache:
                32 128x128 slots
                32 256x256 slots
                16 512x512 slots
(==) intel(0): Backing store disabled
(==) intel(0): Silken mouse enabled
(II) intel(0): Initializing HW Cursor
(II) intel(0): [DRI] installation complete
(II) intel(0): Fixed memory allocation layout:
(II) intel(0): 0x00000000-0x0001ffff: ring buffer (128 kB)
(II) intel(0): 0x00020000-0x0061ffff: compressed frame buffer (6144 kB, 0x000000001f820000 physical
)
(II) intel(0): 0x00620000-0x00620fff: compressed ll buffer (4 kB, 0x000000001fe20000 physical
)
(II) intel(0): 0x00621000-0x0062afff: HW cursors (40 kB, 0x000000001fe21000 physical
)
(II) intel(0): 0x0062b000-0x00632fff: logical 3D context (32 kB)
(II) intel(0): 0x00633000-0x00633fff: overlay registers (4 kB, 0x000000001fe33000 physical
)
(II) intel(0): 0x00634000-0x00643fff: xaa scratch (64 kB)
(II) intel(0): 0x007bf000:            end of stolen memory
(II) intel(0): 0x00800000-0x00ffffff: back buffer (6400 kB) X tiled
(II) intel(0): 0x01000000-0x017fffff: depth buffer (6400 kB) X tiled
(II) intel(0): 0x02000000-0x03ffffff: front buffer (25600 kB) X tiled
(II) intel(0): 0x04000000-0x05ffffff: classic textures (32768 kB)
(II) intel(0): 0x10000000:            end of aperture
(II) intel(0): Selecting standard 18 bit TMDS pixel format.
(II) intel(0): Output configuration:
(II) intel(0):   Pipe A is on
(II) intel(0):   Display plane A is now enabled and connected to pipe A.
(II) intel(0):   Pipe B is on
(II) intel(0):   Display plane B is now enabled and connected to pipe B.
(II) intel(0):   Output VGA is connected to pipe A
(II) intel(0):   Output LVDS is connected to pipe B
(II) intel(0):   Output TV is connected to pipe none
(II) intel(0): [drm] dma control initialized, using IRQ 16

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 11:08                 ` Major 2.6.38 / 2.6.39 / 3.0 " Kirill Smelkov
@ 2011-07-22 14:12                   ` Herbert Xu
  2011-07-22 18:00                   ` Keith Packard
  1 sibling, 0 replies; 36+ messages in thread
From: Herbert Xu @ 2011-07-22 14:12 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: penberg, chris, luke, intel-gfx, linux-kernel, dri-devel, rjw,
	ray-lk, torvalds, akpm, florian, keithp

Kirill Smelkov <kirr@mns.spb.ru> wrote:
> 
>    Or would it be more fair to say that UMS is not supported anymore,
>    is broken and just remove support for it?

It's kind of ironic that the intention of the offending patch
was to avoid a UMS crash :)

Anyway I'm now using KMS (forced to due to new hardware) so I
certainly don't have any objections to reverting this patch.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 11:08                 ` Major 2.6.38 / 2.6.39 / 3.0 " Kirill Smelkov
  2011-07-22 14:12                   ` Herbert Xu
@ 2011-07-22 18:00                   ` Keith Packard
  2011-07-22 20:23                     ` Kirill Smelkov
  1 sibling, 1 reply; 36+ messages in thread
From: Keith Packard @ 2011-07-22 18:00 UTC (permalink / raw)
  To: Kirill Smelkov, Pekka Enberg
  Cc: Chris Wilson, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Herbert Xu, Linus Torvalds,
	Andrew Morton, Florian Mickler

[-- Attachment #1: Type: text/plain, Size: 4252 bytes --]

On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:

> And now after v3.0 is out, I've tested it again, and yes, like it was
> broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
> bad io access the system freezes completely:

I looked at this when I first saw it (a couple of weeks ago), and I
couldn't see any obvious reason this patch would cause this particular
problem. I didn't want to revert the patch at that point as I feared it
would cause other subtle problems. Given that you've got a work-around,
it seemed best to just push this off past 3.0.

Given the failing address passed to ioread32, this seems like it's
probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is 0x21,
which is an offset in 32-bit units within the hardware status page. If
the status_page.page_addr value was zero, then the computed address
would end up being 0x84.

And, it looks like status_page.page_addr *will* end up being zero as a
result of the patch in question. The patch resets the entire ring
structure contents back to the initial values, which includes smashing
the status_page structure to zero, clearing the value of
status_page.page_addr set in i915_init_phys_hws.

Here's an untested patch which moves the initialization of
status_page.page_addr into intel_render_ring_init_dri. I note that
intel_init_render_ring_buffer *already* has the setting of the
status_page.page_addr value, and so I've removed the setting of
status_page.page_addr from i915_init_phys_hws.

I suspect we could remove the memset from intel_init_render_ring_buffer;
it seems entirely superfluous given the memset in i915_init_phys_hws.

From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00 2001
From: Keith Packard <keithp@keithp.com>
Date: Fri, 22 Jul 2011 10:44:39 -0700
Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
 intel_render_ring_init_dri

Physically-addressed hardware status pages are initialized early in
the driver load process by i915_init_phys_hws. For UMS environments,
the ring structure is not initialized until the X server starts. At
that point, the entire ring structure is re-initialized with all new
values. Any values set in the ring structure (including
ring->status_page.page_addr) will be lost when the ring is
re-initialized.

This patch moves the initialization of the status_page.page_addr value
to intel_render_ring_init_dri.

Signed-off-by: Keith Packard <keithp@keithp.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
 drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 1271282..8a3942c 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device *dev)
 static int i915_init_phys_hws(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct intel_ring_buffer *ring = LP_RING(dev_priv);
 
 	/* Program Hardware Status Page */
 	dev_priv->status_page_dmah =
@@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device *dev)
 		DRM_ERROR("Can not allocate hardware status page\n");
 		return -ENOMEM;
 	}
-	ring->status_page.page_addr =
-		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
 
-	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
+	memset_io((void __force __iomem *)dev_priv->status_page_dmah->vaddr,
+		  0, PAGE_SIZE);
 
 	i915_write_hws_pga(dev);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e961568..47b9b27 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
 		ring->get_seqno = pc_render_get_seqno;
 	}
 
+	if (!I915_NEED_GFX_HWS(dev))
+		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
+
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
-- 
1.7.5.4

-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 18:00                   ` Keith Packard
@ 2011-07-22 20:23                     ` Kirill Smelkov
  2011-07-22 20:50                       ` Keith Packard
  2011-07-26 13:48                       ` [Intel-gfx] " Kirill Smelkov
  0 siblings, 2 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-22 20:23 UTC (permalink / raw)
  To: Keith Packard
  Cc: Pekka Enberg, Chris Wilson, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Herbert Xu, Linus Torvalds,
	Andrew Morton, Florian Mickler

Keith,

first of all thanks for your prompt reply. Then...

On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> 
> > And now after v3.0 is out, I've tested it again, and yes, like it was
> > broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
> > bad io access the system freezes completely:
> 
> I looked at this when I first saw it (a couple of weeks ago), and I
> couldn't see any obvious reason this patch would cause this particular
> problem. I didn't want to revert the patch at that point as I feared it
> would cause other subtle problems. Given that you've got a work-around,
> it seemed best to just push this off past 3.0.

What kind of a workaround are you talking about? Sorry, to me it all
looked like "UMS is being ignored forever". Anyway, let's move on to try
to solve the issue.


> Given the failing address passed to ioread32, this seems like it's
> probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is 0x21,
> which is an offset in 32-bit units within the hardware status page. If
> the status_page.page_addr value was zero, then the computed address
> would end up being 0x84.
> 
> And, it looks like status_page.page_addr *will* end up being zero as a
> result of the patch in question. The patch resets the entire ring
> structure contents back to the initial values, which includes smashing
> the status_page structure to zero, clearing the value of
> status_page.page_addr set in i915_init_phys_hws.
> 
> Here's an untested patch which moves the initialization of
> status_page.page_addr into intel_render_ring_init_dri. I note that
> intel_init_render_ring_buffer *already* has the setting of the
> status_page.page_addr value, and so I've removed the setting of
> status_page.page_addr from i915_init_phys_hws.
> 
> I suspect we could remove the memset from intel_init_render_ring_buffer;
> it seems entirely superfluous given the memset in i915_init_phys_hws.
> 
> From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00 2001
> From: Keith Packard <keithp@keithp.com>
> Date: Fri, 22 Jul 2011 10:44:39 -0700
> Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
>  intel_render_ring_init_dri
> 
> Physically-addressed hardware status pages are initialized early in
> the driver load process by i915_init_phys_hws. For UMS environments,
> the ring structure is not initialized until the X server starts. At
> that point, the entire ring structure is re-initialized with all new
> values. Any values set in the ring structure (including
> ring->status_page.page_addr) will be lost when the ring is
> re-initialized.
> 
> This patch moves the initialization of the status_page.page_addr value
> to intel_render_ring_init_dri.
> 
> Signed-off-by: Keith Packard <keithp@keithp.com>
> ---
>  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
>  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 1271282..8a3942c 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device *dev)
>  static int i915_init_phys_hws(struct drm_device *dev)
>  {
>  	drm_i915_private_t *dev_priv = dev->dev_private;
> -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
>  
>  	/* Program Hardware Status Page */
>  	dev_priv->status_page_dmah =
> @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device *dev)
>  		DRM_ERROR("Can not allocate hardware status page\n");
>  		return -ENOMEM;
>  	}
> -	ring->status_page.page_addr =
> -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
>  
> -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> +	memset_io((void __force __iomem *)dev_priv->status_page_dmah->vaddr,
> +		  0, PAGE_SIZE);
>  
>  	i915_write_hws_pga(dev);
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e961568..47b9b27 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
>  		ring->get_seqno = pc_render_get_seqno;
>  	}
>  
> +	if (!I915_NEED_GFX_HWS(dev))
> +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> +
>  	ring->dev = dev;
>  	INIT_LIST_HEAD(&ring->active_list);
>  	INIT_LIST_HEAD(&ring->request_list);

I can't tell whether this is correct, because intel gfx driver is
unknown to me, but from the first glance your description sounds reasonable.

I'm out of office till ~ next week's tuesday, and on return I'll try
to test it on the hardware in question.


Thanks again,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 20:23                     ` Kirill Smelkov
@ 2011-07-22 20:50                       ` Keith Packard
  2011-07-22 21:08                         ` Kirill Smelkov
  2011-07-23 15:55                         ` Pekka Enberg
  2011-07-26 13:48                       ` [Intel-gfx] " Kirill Smelkov
  1 sibling, 2 replies; 36+ messages in thread
From: Keith Packard @ 2011-07-22 20:50 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Pekka Enberg, Chris Wilson, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Herbert Xu, Linus Torvalds,
	Andrew Morton, Florian Mickler

[-- Attachment #1: Type: text/plain, Size: 991 bytes --]

On Sat, 23 Jul 2011 00:23:36 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:

> What kind of a workaround are you talking about?

Just reverting the commit -- that makes your machine work, even if it's
wrong for other machines.

> Sorry, to me it all looked like "UMS is being ignored forever".

You're right, of course -- UMS is a huge wart on the kernel driver at
this point, keeping it working while also adding new functionality
continues to cause challenges. We tend to expect that most people will
run reasonably contemporaneous kernel and user space code, and so three
years after the switch, it continues to surprise us when someone
actually tries UMS.

> I'm out of office till ~ next week's tuesday, and on return I'll try
> to test it on the hardware in question.

Let me know; I've pushed this patch to my drm-intel-fixes tree on
kernel.org in the meantime; if it does solve the problem, I'd like to
add your Tested-by: line.

-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 20:50                       ` Keith Packard
@ 2011-07-22 21:08                         ` Kirill Smelkov
  2011-07-22 21:31                           ` Kirill Smelkov
  2011-07-23 15:55                         ` Pekka Enberg
  1 sibling, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-22 21:08 UTC (permalink / raw)
  To: Keith Packard
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Pekka Enberg, Ray Lee, Andrew Morton, Linus Torvalds

On Fri, Jul 22, 2011 at 01:50:04PM -0700, Keith Packard wrote:
> On Sat, 23 Jul 2011 00:23:36 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> 
> > What kind of a workaround are you talking about?
> 
> Just reverting the commit -- that makes your machine work, even if it's
> wrong for other machines.

Yes, I could revert it. But since the driver is reasonably complex, it
is better to know what I'm doing and that the change makes sense,
especially when it's not "my machine", but lots of target boards located
all over the country.

That's why I wanted, and imho reasonably, because I did the homework,
your feedback - to be not on my own, alone.


> > Sorry, to me it all looked like "UMS is being ignored forever".
> 
> You're right, of course -- UMS is a huge wart on the kernel driver at
> this point, keeping it working while also adding new functionality
> continues to cause challenges. We tend to expect that most people will
> run reasonably contemporaneous kernel and user space code, and so three
> years after the switch, it continues to surprise us when someone
> actually tries UMS.

We are planning upgrade to KMS too. The kernel is upgraded more often
compared to userspace, because of already mentioned (thanks!) "no
regression" rule. Userspace is more complex and more work in my context,
so it is lagging, but eventually we'll get there.

So I hope some day, when everyone upgrades, UMS support could be
"cleaned up" out from the driver.


> > I'm out of office till ~ next week's tuesday, and on return I'll try
> > to test it on the hardware in question.
> 
> Let me know; I've pushed this patch to my drm-intel-fixes tree on
> kernel.org in the meantime; if it does solve the problem, I'd like to
> add your Tested-by: line.

Yes, sure, I'll let you know the results.


Thanks,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 21:08                         ` Kirill Smelkov
@ 2011-07-22 21:31                           ` Kirill Smelkov
  2011-07-23 15:10                             ` [Intel-gfx] " Alex Deucher
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-22 21:31 UTC (permalink / raw)
  To: Keith Packard
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Sat, Jul 23, 2011 at 01:08:14AM +0400, Kirill Smelkov wrote:
> On Fri, Jul 22, 2011 at 01:50:04PM -0700, Keith Packard wrote:

> > You're right, of course -- UMS is a huge wart on the kernel driver at
> > this point, keeping it working while also adding new functionality
> > continues to cause challenges. We tend to expect that most people will
> > run reasonably contemporaneous kernel and user space code, and so three
> > years after the switch, it continues to surprise us when someone
> > actually tries UMS.
> 
> We are planning upgrade to KMS too. The kernel is upgraded more often
> compared to userspace, because of already mentioned (thanks!) "no
> regression" rule. Userspace is more complex and more work in my context,
> so it is lagging, but eventually we'll get there.

Also wanted to say, that if whole X could be built, like the kernel, from one
repo without multirepo-setup tool, with 100% reliable working
incremental rebuild, etc... it would be a bit easier to upgrade X too.

Sorry for being a bit offtopic, could not resist. I was keeping that
though in my head for ~ 2 years already, and now had a chance to mention it.



Thanks,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 21:31                           ` Kirill Smelkov
@ 2011-07-23 15:10                             ` Alex Deucher
  2011-07-23 18:19                               ` Kirill Smelkov
  0 siblings, 1 reply; 36+ messages in thread
From: Alex Deucher @ 2011-07-23 15:10 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Pekka Enberg, Ray Lee, Andrew Morton, Linus Torvalds

On Fri, Jul 22, 2011 at 5:31 PM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> On Sat, Jul 23, 2011 at 01:08:14AM +0400, Kirill Smelkov wrote:
>> On Fri, Jul 22, 2011 at 01:50:04PM -0700, Keith Packard wrote:
>
>> > You're right, of course -- UMS is a huge wart on the kernel driver at
>> > this point, keeping it working while also adding new functionality
>> > continues to cause challenges. We tend to expect that most people will
>> > run reasonably contemporaneous kernel and user space code, and so three
>> > years after the switch, it continues to surprise us when someone
>> > actually tries UMS.
>>
>> We are planning upgrade to KMS too. The kernel is upgraded more often
>> compared to userspace, because of already mentioned (thanks!) "no
>> regression" rule. Userspace is more complex and more work in my context,
>> so it is lagging, but eventually we'll get there.
>
> Also wanted to say, that if whole X could be built, like the kernel, from one
> repo without multirepo-setup tool, with 100% reliable working
> incremental rebuild, etc... it would be a bit easier to upgrade X too.
>
> Sorry for being a bit offtopic, could not resist. I was keeping that
> though in my head for ~ 2 years already, and now had a chance to mention it.

You don't have to rebuild all of X to use KMS.  In most cases, you
just need to update the ddx for your card.

Alex

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 20:50                       ` Keith Packard
  2011-07-22 21:08                         ` Kirill Smelkov
@ 2011-07-23 15:55                         ` Pekka Enberg
  2011-07-25  4:29                           ` Keith Packard
  1 sibling, 1 reply; 36+ messages in thread
From: Pekka Enberg @ 2011-07-23 15:55 UTC (permalink / raw)
  To: Keith Packard
  Cc: Kirill Smelkov, Chris Wilson, Luke-Jr, intel-gfx, LKML,
	dri-devel, Rafael J. Wysocki, Ray Lee, Herbert Xu,
	Linus Torvalds, Andrew Morton, Florian Mickler

Hi Keith,

On Fri, 22 Jul 2011, Keith Packard wrote:
>> Sorry, to me it all looked like "UMS is being ignored forever".
>
> You're right, of course -- UMS is a huge wart on the kernel driver at
> this point, keeping it working while also adding new functionality
> continues to cause challenges. We tend to expect that most people will
> run reasonably contemporaneous kernel and user space code, and so three
> years after the switch, it continues to surprise us when someone
> actually tries UMS.

I know I sound like a broken record but I really wish you i915 devs were 
little more eager to revert broken patches early rather than late. I mean, 
this particular breakage was already bisected but nobody said or 
did anything - and it's not like it's the first time either!

I suppose I need to bribe Linus somehow to be more strict with you folks.

 			Pekka

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-23 15:10                             ` [Intel-gfx] " Alex Deucher
@ 2011-07-23 18:19                               ` Kirill Smelkov
  0 siblings, 0 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-23 18:19 UTC (permalink / raw)
  To: Alex Deucher
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Pekka Enberg, Ray Lee, Andrew Morton, Linus Torvalds

On Sat, Jul 23, 2011 at 11:10:53AM -0400, Alex Deucher wrote:
> On Fri, Jul 22, 2011 at 5:31 PM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> > On Sat, Jul 23, 2011 at 01:08:14AM +0400, Kirill Smelkov wrote:
> >> On Fri, Jul 22, 2011 at 01:50:04PM -0700, Keith Packard wrote:
> >
> >> > You're right, of course -- UMS is a huge wart on the kernel driver at
> >> > this point, keeping it working while also adding new functionality
> >> > continues to cause challenges. We tend to expect that most people will
> >> > run reasonably contemporaneous kernel and user space code, and so three
> >> > years after the switch, it continues to surprise us when someone
> >> > actually tries UMS.
> >>
> >> We are planning upgrade to KMS too. The kernel is upgraded more often
> >> compared to userspace, because of already mentioned (thanks!) "no
> >> regression" rule. Userspace is more complex and more work in my context,
> >> so it is lagging, but eventually we'll get there.
> >
> > Also wanted to say, that if whole X could be built, like the kernel, from one
> > repo without multirepo-setup tool, with 100% reliable working
> > incremental rebuild, etc... it would be a bit easier to upgrade X too.
> >
> > Sorry for being a bit offtopic, could not resist. I was keeping that
> > though in my head for ~ 2 years already, and now had a chance to mention it.
> 
> You don't have to rebuild all of X to use KMS.  In most cases, you
> just need to update the ddx for your card.

I meant the rebuilt not to use KMS, but general case. To me the kernel
has one of the great advantage of being lots of self-consistent code
because of being maintained in one repo + good build system + good
development process. And as the result it is (relatively) easy to
upgrade.

Anyway, this is just a note from both kernel and X stranger, so
whatever...


Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-23 15:55                         ` Pekka Enberg
@ 2011-07-25  4:29                           ` Keith Packard
  0 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2011-07-25  4:29 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Kirill Smelkov, Chris Wilson, Luke-Jr, intel-gfx, LKML,
	dri-devel, Rafael J. Wysocki, Ray Lee, Herbert Xu,
	Linus Torvalds, Andrew Morton, Florian Mickler

[-- Attachment #1: Type: text/plain, Size: 1008 bytes --]

On Sat, 23 Jul 2011 18:55:48 +0300 (EEST), Pekka Enberg <penberg@kernel.org> wrote:

> I know I sound like a broken record but I really wish you i915 devs were 
> little more eager to revert broken patches early rather than late. I mean, 
> this particular breakage was already bisected but nobody said or 
> did anything - and it's not like it's the first time either!

We've switched processes starting with 2.6.39 and I think we're doing
better in this regard. For this particular issue, the regression came
with 2.6.38, and the revert was too large for me to consider merging
just before 3.0 shipped -- I knew reverting it *would* cause problems
for anyone using UMS on newer hardware.

> I suppose I need to bribe Linus somehow to be more strict with you
> folks.

He nicely delivered the message for you a few months ago in person.

In any case, I'm hoping that my smaller fix will resolve the problem and
also not cause regressions for other users.

-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-22 20:23                     ` Kirill Smelkov
  2011-07-22 20:50                       ` Keith Packard
@ 2011-07-26 13:48                       ` Kirill Smelkov
  2011-08-09 12:08                         ` Kirill Smelkov
  1 sibling, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-07-26 13:48 UTC (permalink / raw)
  To: Keith Packard
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Pekka Enberg, Ray Lee, Andrew Morton, Linus Torvalds

On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> Keith,
> 
> first of all thanks for your prompt reply. Then...
> 
> On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> > 
> > > And now after v3.0 is out, I've tested it again, and yes, like it was
> > > broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
> > > bad io access the system freezes completely:
> > 
> > I looked at this when I first saw it (a couple of weeks ago), and I
> > couldn't see any obvious reason this patch would cause this particular
> > problem. I didn't want to revert the patch at that point as I feared it
> > would cause other subtle problems. Given that you've got a work-around,
> > it seemed best to just push this off past 3.0.
> 
> What kind of a workaround are you talking about? Sorry, to me it all
> looked like "UMS is being ignored forever". Anyway, let's move on to try
> to solve the issue.
> 
> 
> > Given the failing address passed to ioread32, this seems like it's
> > probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is 0x21,
> > which is an offset in 32-bit units within the hardware status page. If
> > the status_page.page_addr value was zero, then the computed address
> > would end up being 0x84.
> > 
> > And, it looks like status_page.page_addr *will* end up being zero as a
> > result of the patch in question. The patch resets the entire ring
> > structure contents back to the initial values, which includes smashing
> > the status_page structure to zero, clearing the value of
> > status_page.page_addr set in i915_init_phys_hws.
> > 
> > Here's an untested patch which moves the initialization of
> > status_page.page_addr into intel_render_ring_init_dri. I note that
> > intel_init_render_ring_buffer *already* has the setting of the
> > status_page.page_addr value, and so I've removed the setting of
> > status_page.page_addr from i915_init_phys_hws.
> > 
> > I suspect we could remove the memset from intel_init_render_ring_buffer;
> > it seems entirely superfluous given the memset in i915_init_phys_hws.
> > 
> > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00 2001
> > From: Keith Packard <keithp@keithp.com>
> > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
> >  intel_render_ring_init_dri
> > 
> > Physically-addressed hardware status pages are initialized early in
> > the driver load process by i915_init_phys_hws. For UMS environments,
> > the ring structure is not initialized until the X server starts. At
> > that point, the entire ring structure is re-initialized with all new
> > values. Any values set in the ring structure (including
> > ring->status_page.page_addr) will be lost when the ring is
> > re-initialized.
> > 
> > This patch moves the initialization of the status_page.page_addr value
> > to intel_render_ring_init_dri.
> > 
> > Signed-off-by: Keith Packard <keithp@keithp.com>
> > ---
> >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> >  2 files changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 1271282..8a3942c 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device *dev)
> >  static int i915_init_phys_hws(struct drm_device *dev)
> >  {
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> >  
> >  	/* Program Hardware Status Page */
> >  	dev_priv->status_page_dmah =
> > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device *dev)
> >  		DRM_ERROR("Can not allocate hardware status page\n");
> >  		return -ENOMEM;
> >  	}
> > -	ring->status_page.page_addr =
> > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> >  
> > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > +	memset_io((void __force __iomem *)dev_priv->status_page_dmah->vaddr,
> > +		  0, PAGE_SIZE);
> >  
> >  	i915_write_hws_pga(dev);
> >  
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index e961568..47b9b27 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
> >  		ring->get_seqno = pc_render_get_seqno;
> >  	}
> >  
> > +	if (!I915_NEED_GFX_HWS(dev))
> > +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> > +
> >  	ring->dev = dev;
> >  	INIT_LIST_HEAD(&ring->active_list);
> >  	INIT_LIST_HEAD(&ring->request_list);
> 
> I can't tell whether this is correct, because intel gfx driver is
> unknown to me, but from the first glance your description sounds reasonable.
> 
> I'm out of office till ~ next week's tuesday, and on return I'll try
> to test it on the hardware in question.

Keith, thanks again for the patch. As promised I've tested it on the
hardware in question and yes, bad_access is gone and X seems to work,
so thank you, but...


I see there are more such bugs in introduced-in-guilty-patch
intel_render_ring_init_dri(). For example ring->irq_queue is
left uninitialized and also ring->irq_lock etc...

I'm X newbie, so if here is something stupid X-wise, please don't
beat me too hard, but to me the gist of the problem is the original
patch, where Chris does

( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 03e3370..51fbc5e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
>         return intel_init_ring_buffer(dev, ring);
>  }
>  
> +int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
> +{
> +       drm_i915_private_t *dev_priv = dev->dev_private;
> +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> +
> +       *ring = render_ring;
          ^^^^^^^^^^^^^^^^^^^
          here resets

> +       if (INTEL_INFO(dev)->gen >= 6) {
> +               ring->add_request = gen6_add_request;
> +               ring->irq_get = gen6_render_ring_get_irq;
> +               ring->irq_put = gen6_render_ring_put_irq;
> +       } else if (IS_GEN5(dev)) {
> +               ring->add_request = pc_render_add_request;
> +               ring->get_seqno = pc_render_get_seqno;
> +       }

and then the rest of the `ring` is initialized seemingly copy-pasted
from intel_init_ring_buffer():

> +       ring->dev = dev;
> +       INIT_LIST_HEAD(&ring->active_list);
> +       INIT_LIST_HEAD(&ring->request_list);
> +       INIT_LIST_HEAD(&ring->gpu_write_list);
> +
> +       ring->size = size;
> +       ring->effective_size = ring->size;
> +       if (IS_I830(ring->dev))
> +               ring->effective_size -= 128;
> +
> +       ring->map.offset = start;
> +       ring->map.size = size;
> +       ring->map.type = 0;
> +       ring->map.flags = 0;
> +       ring->map.mtrr = 0;
...

where both 3 chunks go almost exactly from intel_init_ring_buffer(), and
ring->effective_size tweak even stripped original comment:

# original version from intel_init_ring_buffer():
        /* Workaround an erratum on the i830 which causes a hang if
         * the TAIL pointer points to within the last 2 cachelines
         * of the buffer.
         */
        ring->effective_size = ring->size;
        if (IS_I830(ring->dev))
                ring->effective_size -= 128;

...


The line marked "here resets" resets all the fields, and maybe it's not a good
idea to re-initialize them all afterwards (missing some as this thread show),
or at least if it is really needed, share initialization code between
intel_render_ring_init_dri() and intel_init_ring_buffer() ?

>From the outside it looks like the offending patch was done as a quick
fix in a hurry (lots of copy-paste), and maybe it would be better to
re-do it properly...


Thanks again,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-07-26 13:48                       ` [Intel-gfx] " Kirill Smelkov
@ 2011-08-09 12:08                         ` Kirill Smelkov
  2011-08-09 14:00                           ` Vasily Khoruzhick
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-09 12:08 UTC (permalink / raw)
  To: Keith Packard
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Pekka Enberg, Ray Lee, Andrew Morton, Linus Torvalds

On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > Keith,
> > 
> > first of all thanks for your prompt reply. Then...
> > 
> > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> > > 
> > > > And now after v3.0 is out, I've tested it again, and yes, like it was
> > > > broken on v3.0-rc5, it is (now even more) broken on v3.0 -- after first
> > > > bad io access the system freezes completely:
> > > 
> > > I looked at this when I first saw it (a couple of weeks ago), and I
> > > couldn't see any obvious reason this patch would cause this particular
> > > problem. I didn't want to revert the patch at that point as I feared it
> > > would cause other subtle problems. Given that you've got a work-around,
> > > it seemed best to just push this off past 3.0.
> > 
> > What kind of a workaround are you talking about? Sorry, to me it all
> > looked like "UMS is being ignored forever". Anyway, let's move on to try
> > to solve the issue.
> > 
> > 
> > > Given the failing address passed to ioread32, this seems like it's
> > > probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is 0x21,
> > > which is an offset in 32-bit units within the hardware status page. If
> > > the status_page.page_addr value was zero, then the computed address
> > > would end up being 0x84.
> > > 
> > > And, it looks like status_page.page_addr *will* end up being zero as a
> > > result of the patch in question. The patch resets the entire ring
> > > structure contents back to the initial values, which includes smashing
> > > the status_page structure to zero, clearing the value of
> > > status_page.page_addr set in i915_init_phys_hws.
> > > 
> > > Here's an untested patch which moves the initialization of
> > > status_page.page_addr into intel_render_ring_init_dri. I note that
> > > intel_init_render_ring_buffer *already* has the setting of the
> > > status_page.page_addr value, and so I've removed the setting of
> > > status_page.page_addr from i915_init_phys_hws.
> > > 
> > > I suspect we could remove the memset from intel_init_render_ring_buffer;
> > > it seems entirely superfluous given the memset in i915_init_phys_hws.
> > > 
> > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00 2001
> > > From: Keith Packard <keithp@keithp.com>
> > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
> > >  intel_render_ring_init_dri
> > > 
> > > Physically-addressed hardware status pages are initialized early in
> > > the driver load process by i915_init_phys_hws. For UMS environments,
> > > the ring structure is not initialized until the X server starts. At
> > > that point, the entire ring structure is re-initialized with all new
> > > values. Any values set in the ring structure (including
> > > ring->status_page.page_addr) will be lost when the ring is
> > > re-initialized.
> > > 
> > > This patch moves the initialization of the status_page.page_addr value
> > > to intel_render_ring_init_dri.
> > > 
> > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > > index 1271282..8a3942c 100644
> > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device *dev)
> > >  static int i915_init_phys_hws(struct drm_device *dev)
> > >  {
> > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > >  
> > >  	/* Program Hardware Status Page */
> > >  	dev_priv->status_page_dmah =
> > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device *dev)
> > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > >  		return -ENOMEM;
> > >  	}
> > > -	ring->status_page.page_addr =
> > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > >  
> > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > +	memset_io((void __force __iomem *)dev_priv->status_page_dmah->vaddr,
> > > +		  0, PAGE_SIZE);
> > >  
> > >  	i915_write_hws_pga(dev);
> > >  
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > index e961568..47b9b27 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
> > >  		ring->get_seqno = pc_render_get_seqno;
> > >  	}
> > >  
> > > +	if (!I915_NEED_GFX_HWS(dev))
> > > +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> > > +
> > >  	ring->dev = dev;
> > >  	INIT_LIST_HEAD(&ring->active_list);
> > >  	INIT_LIST_HEAD(&ring->request_list);
> > 
> > I can't tell whether this is correct, because intel gfx driver is
> > unknown to me, but from the first glance your description sounds reasonable.
> > 
> > I'm out of office till ~ next week's tuesday, and on return I'll try
> > to test it on the hardware in question.
> 
> Keith, thanks again for the patch. As promised I've tested it on the
> hardware in question and yes, bad_access is gone and X seems to work,
> so thank you, but...
> 
> 
> I see there are more such bugs in introduced-in-guilty-patch
> intel_render_ring_init_dri(). For example ring->irq_queue is
> left uninitialized and also ring->irq_lock etc...
>
>
> I'm X newbie, so if here is something stupid X-wise, please don't
> beat me too hard, but to me the gist of the problem is the original
> patch, where Chris does
> 
> ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 03e3370..51fbc5e 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
> >         return intel_init_ring_buffer(dev, ring);
> >  }
> >  
> > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
> > +{
> > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > +
> > +       *ring = render_ring;
>           ^^^^^^^^^^^^^^^^^^^
>           here resets
> 
> > +       if (INTEL_INFO(dev)->gen >= 6) {
> > +               ring->add_request = gen6_add_request;
> > +               ring->irq_get = gen6_render_ring_get_irq;
> > +               ring->irq_put = gen6_render_ring_put_irq;
> > +       } else if (IS_GEN5(dev)) {
> > +               ring->add_request = pc_render_add_request;
> > +               ring->get_seqno = pc_render_get_seqno;
> > +       }
> 
> and then the rest of the `ring` is initialized seemingly copy-pasted
> from intel_init_ring_buffer():
> 
> > +       ring->dev = dev;
> > +       INIT_LIST_HEAD(&ring->active_list);
> > +       INIT_LIST_HEAD(&ring->request_list);
> > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > +
> > +       ring->size = size;
> > +       ring->effective_size = ring->size;
> > +       if (IS_I830(ring->dev))
> > +               ring->effective_size -= 128;
> > +
> > +       ring->map.offset = start;
> > +       ring->map.size = size;
> > +       ring->map.type = 0;
> > +       ring->map.flags = 0;
> > +       ring->map.mtrr = 0;
> ...
> 
> where both 3 chunks go almost exactly from intel_init_ring_buffer(), and
> ring->effective_size tweak even stripped original comment:
> 
> # original version from intel_init_ring_buffer():
>         /* Workaround an erratum on the i830 which causes a hang if
>          * the TAIL pointer points to within the last 2 cachelines
>          * of the buffer.
>          */
>         ring->effective_size = ring->size;
>         if (IS_I830(ring->dev))
>                 ring->effective_size -= 128;
> 
> ...
> 
> 
> The line marked "here resets" resets all the fields, and maybe it's not a good
> idea to re-initialize them all afterwards (missing some as this thread show),
> or at least if it is really needed, share initialization code between
> intel_render_ring_init_dri() and intel_init_ring_buffer() ?
> 
> >From the outside it looks like the offending patch was done as a quick
> fix in a hurry (lots of copy-paste), and maybe it would be better to
> re-do it properly...

Silence... ?

I read UMS is still ignored, because e.g. that uninitialized
ring->irq_lock which I've wrote about above is for sure used e.g. in
gen6_render_ring_get_irq() added to ring vtable in
intel_render_ring_init_dri().

And also is copy-pasting, instead of properly structuring things, ok?


Why not revert what caused trouble and introduced other subtle bugs, and
redo things properly in the first place?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 12:08                         ` Kirill Smelkov
@ 2011-08-09 14:00                           ` Vasily Khoruzhick
  2011-08-09 14:47                             ` Kirill Smelkov
  0 siblings, 1 reply; 36+ messages in thread
From: Vasily Khoruzhick @ 2011-08-09 14:00 UTC (permalink / raw)
  To: intel-gfx
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > Keith,
> > > 
> > > first of all thanks for your prompt reply. Then...
> > > 
> > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> 
wrote:
> > > > > And now after v3.0 is out, I've tested it again, and yes, like it
> > > > > was broken on v3.0-rc5, it is (now even more) broken on v3.0 --
> > > > > after first
> > > > 
> > > > > bad io access the system freezes completely:
> > > > I looked at this when I first saw it (a couple of weeks ago), and I
> > > > couldn't see any obvious reason this patch would cause this
> > > > particular problem. I didn't want to revert the patch at that point
> > > > as I feared it would cause other subtle problems. Given that you've
> > > > got a work-around, it seemed best to just push this off past 3.0.
> > > 
> > > What kind of a workaround are you talking about? Sorry, to me it all
> > > looked like "UMS is being ignored forever". Anyway, let's move on to
> > > try to solve the issue.
> > > 
> > > > Given the failing address passed to ioread32, this seems like it's
> > > > probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is
> > > > 0x21, which is an offset in 32-bit units within the hardware status
> > > > page. If the status_page.page_addr value was zero, then the computed
> > > > address would end up being 0x84.
> > > > 
> > > > And, it looks like status_page.page_addr *will* end up being zero as
> > > > a result of the patch in question. The patch resets the entire ring
> > > > structure contents back to the initial values, which includes
> > > > smashing the status_page structure to zero, clearing the value of
> > > > status_page.page_addr set in i915_init_phys_hws.
> > > > 
> > > > Here's an untested patch which moves the initialization of
> > > > status_page.page_addr into intel_render_ring_init_dri. I note that
> > > > intel_init_render_ring_buffer *already* has the setting of the
> > > > status_page.page_addr value, and so I've removed the setting of
> > > > status_page.page_addr from i915_init_phys_hws.
> > > > 
> > > > I suspect we could remove the memset from
> > > > intel_init_render_ring_buffer; it seems entirely superfluous given
> > > > the memset in i915_init_phys_hws.
> > > > 
> > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
> > > > 
> > > >  intel_render_ring_init_dri
> > > > 
> > > > Physically-addressed hardware status pages are initialized early in
> > > > the driver load process by i915_init_phys_hws. For UMS environments,
> > > > the ring structure is not initialized until the X server starts. At
> > > > that point, the entire ring structure is re-initialized with all new
> > > > values. Any values set in the ring structure (including
> > > > ring->status_page.page_addr) will be lost when the ring is
> > > > re-initialized.
> > > > 
> > > > This patch moves the initialization of the status_page.page_addr
> > > > value to intel_render_ring_init_dri.
> > > > 
> > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > ---
> > > > 
> > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device
> > > > *dev)
> > > > 
> > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > >  {
> > > >  
> > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > 
> > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > 
> > > >  	/* Program Hardware Status Page */
> > > >  	dev_priv->status_page_dmah =
> > > > 
> > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device
> > > > *dev)
> > > > 
> > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > >  		return -ENOMEM;
> > > >  	
> > > >  	}
> > > > 
> > > > -	ring->status_page.page_addr =
> > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > 
> > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > +	memset_io((void __force __iomem
> > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > 
> > > >  	i915_write_hws_pga(dev);
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > drm_device *dev, u64 start, u32 size)
> > > > 
> > > >  		ring->get_seqno = pc_render_get_seqno;
> > > >  	
> > > >  	}
> > > > 
> > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> > > > +
> > > > 
> > > >  	ring->dev = dev;
> > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > 
> > > I can't tell whether this is correct, because intel gfx driver is
> > > unknown to me, but from the first glance your description sounds
> > > reasonable.
> > > 
> > > I'm out of office till ~ next week's tuesday, and on return I'll try
> > > to test it on the hardware in question.
> > 
> > Keith, thanks again for the patch. As promised I've tested it on the
> > hardware in question and yes, bad_access is gone and X seems to work,
> > so thank you, but...
> > 
> > 
> > I see there are more such bugs in introduced-in-guilty-patch
> > intel_render_ring_init_dri(). For example ring->irq_queue is
> > left uninitialized and also ring->irq_lock etc...
> > 
> > 
> > I'm X newbie, so if here is something stupid X-wise, please don't
> > beat me too hard, but to me the gist of the problem is the original
> > patch, where Chris does
> > 
> > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > 
> > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > 100644
> > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > drm_device *dev)
> > > 
> > >         return intel_init_ring_buffer(dev, ring);
> > >  
> > >  }
> > > 
> > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> > > size) +{
> > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > +
> > > +       *ring = render_ring;
> > > 
> >           ^^^^^^^^^^^^^^^^^^^
> >           here resets
> > > 
> > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > +               ring->add_request = gen6_add_request;
> > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > +       } else if (IS_GEN5(dev)) {
> > > +               ring->add_request = pc_render_add_request;
> > > +               ring->get_seqno = pc_render_get_seqno;
> > > +       }
> > 
> > and then the rest of the `ring` is initialized seemingly copy-pasted
> > 
> > from intel_init_ring_buffer():
> > > +       ring->dev = dev;
> > > +       INIT_LIST_HEAD(&ring->active_list);
> > > +       INIT_LIST_HEAD(&ring->request_list);
> > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > +
> > > +       ring->size = size;
> > > +       ring->effective_size = ring->size;
> > > +       if (IS_I830(ring->dev))
> > > +               ring->effective_size -= 128;
> > > +
> > > +       ring->map.offset = start;
> > > +       ring->map.size = size;
> > > +       ring->map.type = 0;
> > > +       ring->map.flags = 0;
> > > +       ring->map.mtrr = 0;
> > 
> > ...
> > 
> > where both 3 chunks go almost exactly from intel_init_ring_buffer(), and
> > ring->effective_size tweak even stripped original comment:
> > 
> > # original version from intel_init_ring_buffer():
> >         /* Workaround an erratum on the i830 which causes a hang if
> >         
> >          * the TAIL pointer points to within the last 2 cachelines
> >          * of the buffer.
> >          */
> >         
> >         ring->effective_size = ring->size;
> >         if (IS_I830(ring->dev))
> >         
> >                 ring->effective_size -= 128;
> > 
> > ...
> > 
> > 
> > The line marked "here resets" resets all the fields, and maybe it's not a
> > good idea to re-initialize them all afterwards (missing some as this
> > thread show), or at least if it is really needed, share initialization
> > code between intel_render_ring_init_dri() and intel_init_ring_buffer() ?
> > 
> > >From the outside it looks like the offending patch was done as a quick
> > 
> > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > re-do it properly...
> 
> Silence... ?
> 
> I read UMS is still ignored, because e.g. that uninitialized
> ring->irq_lock which I've wrote about above is for sure used e.g. in
> gen6_render_ring_get_irq() added to ring vtable in
> intel_render_ring_init_dri().

I really doubt that UMS supports gen6 hardware.

Regards
Vasily

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 14:00                           ` Vasily Khoruzhick
@ 2011-08-09 14:47                             ` Kirill Smelkov
  2011-08-09 15:09                               ` Vasily Khoruzhick
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-09 14:47 UTC (permalink / raw)
  To: Vasily Khoruzhick
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > Keith,
> > > > 
> > > > first of all thanks for your prompt reply. Then...
> > > > 
> > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov <kirr@mns.spb.ru> 
> wrote:
> > > > > > And now after v3.0 is out, I've tested it again, and yes, like it
> > > > > > was broken on v3.0-rc5, it is (now even more) broken on v3.0 --
> > > > > > after first
> > > > > 
> > > > > > bad io access the system freezes completely:
> > > > > I looked at this when I first saw it (a couple of weeks ago), and I
> > > > > couldn't see any obvious reason this patch would cause this
> > > > > particular problem. I didn't want to revert the patch at that point
> > > > > as I feared it would cause other subtle problems. Given that you've
> > > > > got a work-around, it seemed best to just push this off past 3.0.
> > > > 
> > > > What kind of a workaround are you talking about? Sorry, to me it all
> > > > looked like "UMS is being ignored forever". Anyway, let's move on to
> > > > try to solve the issue.
> > > > 
> > > > > Given the failing address passed to ioread32, this seems like it's
> > > > > probably the call to READ_BREADCRUMB -- I915_BREADCRUMB_INDEX is
> > > > > 0x21, which is an offset in 32-bit units within the hardware status
> > > > > page. If the status_page.page_addr value was zero, then the computed
> > > > > address would end up being 0x84.
> > > > > 
> > > > > And, it looks like status_page.page_addr *will* end up being zero as
> > > > > a result of the patch in question. The patch resets the entire ring
> > > > > structure contents back to the initial values, which includes
> > > > > smashing the status_page structure to zero, clearing the value of
> > > > > status_page.page_addr set in i915_init_phys_hws.
> > > > > 
> > > > > Here's an untested patch which moves the initialization of
> > > > > status_page.page_addr into intel_render_ring_init_dri. I note that
> > > > > intel_init_render_ring_buffer *already* has the setting of the
> > > > > status_page.page_addr value, and so I've removed the setting of
> > > > > status_page.page_addr from i915_init_phys_hws.
> > > > > 
> > > > > I suspect we could remove the memset from
> > > > > intel_init_render_ring_buffer; it seems entirely superfluous given
> > > > > the memset in i915_init_phys_hws.
> > > > > 
> > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page address in
> > > > > 
> > > > >  intel_render_ring_init_dri
> > > > > 
> > > > > Physically-addressed hardware status pages are initialized early in
> > > > > the driver load process by i915_init_phys_hws. For UMS environments,
> > > > > the ring structure is not initialized until the X server starts. At
> > > > > that point, the entire ring structure is re-initialized with all new
> > > > > values. Any values set in the ring structure (including
> > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > re-initialized.
> > > > > 
> > > > > This patch moves the initialization of the status_page.page_addr
> > > > > value to intel_render_ring_init_dri.
> > > > > 
> > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > ---
> > > > > 
> > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct drm_device
> > > > > *dev)
> > > > > 
> > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > >  {
> > > > >  
> > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > 
> > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > 
> > > > >  	/* Program Hardware Status Page */
> > > > >  	dev_priv->status_page_dmah =
> > > > > 
> > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct drm_device
> > > > > *dev)
> > > > > 
> > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > >  		return -ENOMEM;
> > > > >  	
> > > > >  	}
> > > > > 
> > > > > -	ring->status_page.page_addr =
> > > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > > 
> > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > +	memset_io((void __force __iomem
> > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > 
> > > > >  	i915_write_hws_pga(dev);
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > drm_device *dev, u64 start, u32 size)
> > > > > 
> > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > >  	
> > > > >  	}
> > > > > 
> > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > +		ring->status_page.page_addr = dev_priv->status_page_dmah->vaddr;
> > > > > +
> > > > > 
> > > > >  	ring->dev = dev;
> > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > 
> > > > I can't tell whether this is correct, because intel gfx driver is
> > > > unknown to me, but from the first glance your description sounds
> > > > reasonable.
> > > > 
> > > > I'm out of office till ~ next week's tuesday, and on return I'll try
> > > > to test it on the hardware in question.
> > > 
> > > Keith, thanks again for the patch. As promised I've tested it on the
> > > hardware in question and yes, bad_access is gone and X seems to work,
> > > so thank you, but...
> > > 
> > > 
> > > I see there are more such bugs in introduced-in-guilty-patch
> > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > left uninitialized and also ring->irq_lock etc...
> > > 
> > > 
> > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > beat me too hard, but to me the gist of the problem is the original
> > > patch, where Chris does
> > > 
> > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > > 100644
> > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > drm_device *dev)
> > > > 
> > > >         return intel_init_ring_buffer(dev, ring);
> > > >  
> > > >  }
> > > > 
> > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> > > > size) +{
> > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > +
> > > > +       *ring = render_ring;
> > > > 
> > >           ^^^^^^^^^^^^^^^^^^^
> > >           here resets
> > > > 
> > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > +               ring->add_request = gen6_add_request;
> > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > +       } else if (IS_GEN5(dev)) {
> > > > +               ring->add_request = pc_render_add_request;
> > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > +       }
> > > 
> > > and then the rest of the `ring` is initialized seemingly copy-pasted
> > > 
> > > from intel_init_ring_buffer():
> > > > +       ring->dev = dev;
> > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > +
> > > > +       ring->size = size;
> > > > +       ring->effective_size = ring->size;
> > > > +       if (IS_I830(ring->dev))
> > > > +               ring->effective_size -= 128;
> > > > +
> > > > +       ring->map.offset = start;
> > > > +       ring->map.size = size;
> > > > +       ring->map.type = 0;
> > > > +       ring->map.flags = 0;
> > > > +       ring->map.mtrr = 0;
> > > 
> > > ...
> > > 
> > > where both 3 chunks go almost exactly from intel_init_ring_buffer(), and
> > > ring->effective_size tweak even stripped original comment:
> > > 
> > > # original version from intel_init_ring_buffer():
> > >         /* Workaround an erratum on the i830 which causes a hang if
> > >         
> > >          * the TAIL pointer points to within the last 2 cachelines
> > >          * of the buffer.
> > >          */
> > >         
> > >         ring->effective_size = ring->size;
> > >         if (IS_I830(ring->dev))
> > >         
> > >                 ring->effective_size -= 128;
> > > 
> > > ...
> > > 
> > > 
> > > The line marked "here resets" resets all the fields, and maybe it's not a
> > > good idea to re-initialize them all afterwards (missing some as this
> > > thread show), or at least if it is really needed, share initialization
> > > code between intel_render_ring_init_dri() and intel_init_ring_buffer() ?
> > > 
> > > >From the outside it looks like the offending patch was done as a quick
> > > 
> > > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > > re-do it properly...
> > 
> > Silence... ?
> > 
> > I read UMS is still ignored, because e.g. that uninitialized
> > ring->irq_lock which I've wrote about above is for sure used e.g. in
> > gen6_render_ring_get_irq() added to ring vtable in
> > intel_render_ring_init_dri().
> 
> I really doubt that UMS supports gen6 hardware.

Then why it is there in intel_render_ring_init_dri():

    int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32 size)
    {
    	drm_i915_private_t *dev_priv = dev->dev_private;
    	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
    
    	*ring = render_ring;
    	if (INTEL_INFO(dev)->gen >= 6) {
    		ring->add_request = gen6_add_request;
    		ring->irq_get = gen6_render_ring_get_irq;
    		ring->irq_put = gen6_render_ring_put_irq;
    	} else if (IS_GEN5(dev)) {
    		ring->add_request = pc_render_add_request;
    		ring->get_seqno = pc_render_get_seqno;
    	}


?


Added by the same guilty commit e8616b6c I'm talking about.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 14:47                             ` Kirill Smelkov
@ 2011-08-09 15:09                               ` Vasily Khoruzhick
  2011-08-09 15:34                                 ` Kirill Smelkov
  0 siblings, 1 reply; 36+ messages in thread
From: Vasily Khoruzhick @ 2011-08-09 15:09 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > Keith,
> > > > > 
> > > > > first of all thanks for your prompt reply. Then...
> > > > > 
> > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > <kirr@mns.spb.ru>
> > 
> > wrote:
> > > > > > > And now after v3.0 is out, I've tested it again, and yes, like
> > > > > > > it was broken on v3.0-rc5, it is (now even more) broken on
> > > > > > > v3.0 -- after first
> > > > > > 
> > > > > > > bad io access the system freezes completely:
> > > > > > I looked at this when I first saw it (a couple of weeks ago), and
> > > > > > I couldn't see any obvious reason this patch would cause this
> > > > > > particular problem. I didn't want to revert the patch at that
> > > > > > point as I feared it would cause other subtle problems. Given
> > > > > > that you've got a work-around, it seemed best to just push this
> > > > > > off past 3.0.
> > > > > 
> > > > > What kind of a workaround are you talking about? Sorry, to me it
> > > > > all looked like "UMS is being ignored forever". Anyway, let's move
> > > > > on to try to solve the issue.
> > > > > 
> > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > units within the hardware status page. If the
> > > > > > status_page.page_addr value was zero, then the computed address
> > > > > > would end up being 0x84.
> > > > > > 
> > > > > > And, it looks like status_page.page_addr *will* end up being zero
> > > > > > as a result of the patch in question. The patch resets the
> > > > > > entire ring structure contents back to the initial values, which
> > > > > > includes smashing the status_page structure to zero, clearing
> > > > > > the value of status_page.page_addr set in i915_init_phys_hws.
> > > > > > 
> > > > > > Here's an untested patch which moves the initialization of
> > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > that intel_init_render_ring_buffer *already* has the setting of
> > > > > > the status_page.page_addr value, and so I've removed the setting
> > > > > > of status_page.page_addr from i915_init_phys_hws.
> > > > > > 
> > > > > > I suspect we could remove the memset from
> > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > given the memset in i915_init_phys_hws.
> > > > > > 
> > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > address in
> > > > > > 
> > > > > >  intel_render_ring_init_dri
> > > > > > 
> > > > > > Physically-addressed hardware status pages are initialized early
> > > > > > in the driver load process by i915_init_phys_hws. For UMS
> > > > > > environments, the ring structure is not initialized until the X
> > > > > > server starts. At that point, the entire ring structure is
> > > > > > re-initialized with all new values. Any values set in the ring
> > > > > > structure (including
> > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > re-initialized.
> > > > > > 
> > > > > > This patch moves the initialization of the status_page.page_addr
> > > > > > value to intel_render_ring_init_dri.
> > > > > > 
> > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > ---
> > > > > > 
> > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > drm_device *dev)
> > > > > > 
> > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > >  {
> > > > > >  
> > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > 
> > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > 
> > > > > >  	/* Program Hardware Status Page */
> > > > > >  	dev_priv->status_page_dmah =
> > > > > > 
> > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > drm_device *dev)
> > > > > > 
> > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > >  		return -ENOMEM;
> > > > > >  	
> > > > > >  	}
> > > > > > 
> > > > > > -	ring->status_page.page_addr =
> > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > > > 
> > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > +	memset_io((void __force __iomem
> > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > 
> > > > > >  	i915_write_hws_pga(dev);
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > 
> > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > >  	
> > > > > >  	}
> > > > > > 
> > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > +		ring->status_page.page_addr =
> > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > 
> > > > > >  	ring->dev = dev;
> > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > 
> > > > > I can't tell whether this is correct, because intel gfx driver is
> > > > > unknown to me, but from the first glance your description sounds
> > > > > reasonable.
> > > > > 
> > > > > I'm out of office till ~ next week's tuesday, and on return I'll
> > > > > try to test it on the hardware in question.
> > > > 
> > > > Keith, thanks again for the patch. As promised I've tested it on the
> > > > hardware in question and yes, bad_access is gone and X seems to work,
> > > > so thank you, but...
> > > > 
> > > > 
> > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > left uninitialized and also ring->irq_lock etc...
> > > > 
> > > > 
> > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > beat me too hard, but to me the gist of the problem is the original
> > > > patch, where Chris does
> > > > 
> > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > > > 100644
> > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > drm_device *dev)
> > > > > 
> > > > >         return intel_init_ring_buffer(dev, ring);
> > > > >  
> > > > >  }
> > > > > 
> > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > > > > u32 size) +{
> > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > +
> > > > > +       *ring = render_ring;
> > > > > 
> > > >           ^^^^^^^^^^^^^^^^^^^
> > > >           here resets
> > > > > 
> > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > +               ring->add_request = gen6_add_request;
> > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > +       } else if (IS_GEN5(dev)) {
> > > > > +               ring->add_request = pc_render_add_request;
> > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > +       }
> > > > 
> > > > and then the rest of the `ring` is initialized seemingly copy-pasted
> > > > 
> > > > from intel_init_ring_buffer():
> > > > > +       ring->dev = dev;
> > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > +
> > > > > +       ring->size = size;
> > > > > +       ring->effective_size = ring->size;
> > > > > +       if (IS_I830(ring->dev))
> > > > > +               ring->effective_size -= 128;
> > > > > +
> > > > > +       ring->map.offset = start;
> > > > > +       ring->map.size = size;
> > > > > +       ring->map.type = 0;
> > > > > +       ring->map.flags = 0;
> > > > > +       ring->map.mtrr = 0;
> > > > 
> > > > ...
> > > > 
> > > > where both 3 chunks go almost exactly from intel_init_ring_buffer(),
> > > > and ring->effective_size tweak even stripped original comment:
> > > > 
> > > > # original version from intel_init_ring_buffer():
> > > >         /* Workaround an erratum on the i830 which causes a hang if
> > > >         
> > > >          * the TAIL pointer points to within the last 2 cachelines
> > > >          * of the buffer.
> > > >          */
> > > >         
> > > >         ring->effective_size = ring->size;
> > > >         if (IS_I830(ring->dev))
> > > >         
> > > >                 ring->effective_size -= 128;
> > > > 
> > > > ...
> > > > 
> > > > 
> > > > The line marked "here resets" resets all the fields, and maybe it's
> > > > not a good idea to re-initialize them all afterwards (missing some
> > > > as this thread show), or at least if it is really needed, share
> > > > initialization code between intel_render_ring_init_dri() and
> > > > intel_init_ring_buffer() ?
> > > > 
> > > > >From the outside it looks like the offending patch was done as a
> > > > >quick
> > > > 
> > > > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > > > re-do it properly...
> > > 
> > > Silence... ?
> > > 
> > > I read UMS is still ignored, because e.g. that uninitialized
> > > ring->irq_lock which I've wrote about above is for sure used e.g. in
> > > gen6_render_ring_get_irq() added to ring vtable in
> > > intel_render_ring_init_dri().
> > 
> > I really doubt that UMS supports gen6 hardware.
> 
> Then why it is there in intel_render_ring_init_dri():
> 
>     int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> size) {
>     	drm_i915_private_t *dev_priv = dev->dev_private;
>     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> 
>     	*ring = render_ring;
>     	if (INTEL_INFO(dev)->gen >= 6) {

This branch executes only when hw generation is 6 or newer.

>     		ring->add_request = gen6_add_request;
>     		ring->irq_get = gen6_render_ring_get_irq;
>     		ring->irq_put = gen6_render_ring_put_irq;
>     	} else if (IS_GEN5(dev)) {
>     		ring->add_request = pc_render_add_request;
>     		ring->get_seqno = pc_render_get_seqno;
>     	}

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 15:09                               ` Vasily Khoruzhick
@ 2011-08-09 15:34                                 ` Kirill Smelkov
  2011-08-09 16:02                                   ` Vasily Khoruzhick
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-09 15:34 UTC (permalink / raw)
  To: Vasily Khoruzhick
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Tue, Aug 09, 2011 at 06:09:57PM +0300, Vasily Khoruzhick wrote:
> On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> > On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > > Keith,
> > > > > > 
> > > > > > first of all thanks for your prompt reply. Then...
> > > > > > 
> > > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > > <kirr@mns.spb.ru>
> > > 
> > > wrote:
> > > > > > > > And now after v3.0 is out, I've tested it again, and yes, like
> > > > > > > > it was broken on v3.0-rc5, it is (now even more) broken on
> > > > > > > > v3.0 -- after first
> > > > > > > 
> > > > > > > > bad io access the system freezes completely:
> > > > > > > I looked at this when I first saw it (a couple of weeks ago), and
> > > > > > > I couldn't see any obvious reason this patch would cause this
> > > > > > > particular problem. I didn't want to revert the patch at that
> > > > > > > point as I feared it would cause other subtle problems. Given
> > > > > > > that you've got a work-around, it seemed best to just push this
> > > > > > > off past 3.0.
> > > > > > 
> > > > > > What kind of a workaround are you talking about? Sorry, to me it
> > > > > > all looked like "UMS is being ignored forever". Anyway, let's move
> > > > > > on to try to solve the issue.
> > > > > > 
> > > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > > units within the hardware status page. If the
> > > > > > > status_page.page_addr value was zero, then the computed address
> > > > > > > would end up being 0x84.
> > > > > > > 
> > > > > > > And, it looks like status_page.page_addr *will* end up being zero
> > > > > > > as a result of the patch in question. The patch resets the
> > > > > > > entire ring structure contents back to the initial values, which
> > > > > > > includes smashing the status_page structure to zero, clearing
> > > > > > > the value of status_page.page_addr set in i915_init_phys_hws.
> > > > > > > 
> > > > > > > Here's an untested patch which moves the initialization of
> > > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > > that intel_init_render_ring_buffer *already* has the setting of
> > > > > > > the status_page.page_addr value, and so I've removed the setting
> > > > > > > of status_page.page_addr from i915_init_phys_hws.
> > > > > > > 
> > > > > > > I suspect we could remove the memset from
> > > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > > given the memset in i915_init_phys_hws.
> > > > > > > 
> > > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > > address in
> > > > > > > 
> > > > > > >  intel_render_ring_init_dri
> > > > > > > 
> > > > > > > Physically-addressed hardware status pages are initialized early
> > > > > > > in the driver load process by i915_init_phys_hws. For UMS
> > > > > > > environments, the ring structure is not initialized until the X
> > > > > > > server starts. At that point, the entire ring structure is
> > > > > > > re-initialized with all new values. Any values set in the ring
> > > > > > > structure (including
> > > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > > re-initialized.
> > > > > > > 
> > > > > > > This patch moves the initialization of the status_page.page_addr
> > > > > > > value to intel_render_ring_init_dri.
> > > > > > > 
> > > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > > ---
> > > > > > > 
> > > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > > >  {
> > > > > > >  
> > > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > 
> > > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > > 
> > > > > > >  	/* Program Hardware Status Page */
> > > > > > >  	dev_priv->status_page_dmah =
> > > > > > > 
> > > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > > >  		return -ENOMEM;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > -	ring->status_page.page_addr =
> > > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > > > > 
> > > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > > +	memset_io((void __force __iomem
> > > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > > 
> > > > > > >  	i915_write_hws_pga(dev);
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > > > > 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > > 
> > > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > > +		ring->status_page.page_addr =
> > > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > > 
> > > > > > >  	ring->dev = dev;
> > > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > > 
> > > > > > I can't tell whether this is correct, because intel gfx driver is
> > > > > > unknown to me, but from the first glance your description sounds
> > > > > > reasonable.
> > > > > > 
> > > > > > I'm out of office till ~ next week's tuesday, and on return I'll
> > > > > > try to test it on the hardware in question.
> > > > > 
> > > > > Keith, thanks again for the patch. As promised I've tested it on the
> > > > > hardware in question and yes, bad_access is gone and X seems to work,
> > > > > so thank you, but...
> > > > > 
> > > > > 
> > > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > > left uninitialized and also ring->irq_lock etc...
> > > > > 
> > > > > 
> > > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > > beat me too hard, but to me the gist of the problem is the original
> > > > > patch, where Chris does
> > > > > 
> > > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > > drm_device *dev)
> > > > > > 
> > > > > >         return intel_init_ring_buffer(dev, ring);
> > > > > >  
> > > > > >  }
> > > > > > 
> > > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > > > > > u32 size) +{
> > > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > > +
> > > > > > +       *ring = render_ring;
> > > > > > 
> > > > >           ^^^^^^^^^^^^^^^^^^^
> > > > >           here resets
> > > > > > 
> > > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > > +               ring->add_request = gen6_add_request;
> > > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > > +       } else if (IS_GEN5(dev)) {
> > > > > > +               ring->add_request = pc_render_add_request;
> > > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > > +       }
> > > > > 
> > > > > and then the rest of the `ring` is initialized seemingly copy-pasted
> > > > > 
> > > > > from intel_init_ring_buffer():
> > > > > > +       ring->dev = dev;
> > > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > > +
> > > > > > +       ring->size = size;
> > > > > > +       ring->effective_size = ring->size;
> > > > > > +       if (IS_I830(ring->dev))
> > > > > > +               ring->effective_size -= 128;
> > > > > > +
> > > > > > +       ring->map.offset = start;
> > > > > > +       ring->map.size = size;
> > > > > > +       ring->map.type = 0;
> > > > > > +       ring->map.flags = 0;
> > > > > > +       ring->map.mtrr = 0;
> > > > > 
> > > > > ...
> > > > > 
> > > > > where both 3 chunks go almost exactly from intel_init_ring_buffer(),
> > > > > and ring->effective_size tweak even stripped original comment:
> > > > > 
> > > > > # original version from intel_init_ring_buffer():
> > > > >         /* Workaround an erratum on the i830 which causes a hang if
> > > > >         
> > > > >          * the TAIL pointer points to within the last 2 cachelines
> > > > >          * of the buffer.
> > > > >          */
> > > > >         
> > > > >         ring->effective_size = ring->size;
> > > > >         if (IS_I830(ring->dev))
> > > > >         
> > > > >                 ring->effective_size -= 128;
> > > > > 
> > > > > ...
> > > > > 
> > > > > 
> > > > > The line marked "here resets" resets all the fields, and maybe it's
> > > > > not a good idea to re-initialize them all afterwards (missing some
> > > > > as this thread show), or at least if it is really needed, share
> > > > > initialization code between intel_render_ring_init_dri() and
> > > > > intel_init_ring_buffer() ?
> > > > > 
> > > > > >From the outside it looks like the offending patch was done as a
> > > > > >quick
> > > > > 
> > > > > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > > > > re-do it properly...
> > > > 
> > > > Silence... ?
> > > > 
> > > > I read UMS is still ignored, because e.g. that uninitialized
> > > > ring->irq_lock which I've wrote about above is for sure used e.g. in
> > > > gen6_render_ring_get_irq() added to ring vtable in
> > > > intel_render_ring_init_dri().
> > > 
> > > I really doubt that UMS supports gen6 hardware.
> > 
> > Then why it is there in intel_render_ring_init_dri():
> > 
> >     int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> > size) {
> >     	drm_i915_private_t *dev_priv = dev->dev_private;
> >     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > 
> >     	*ring = render_ring;
> >     	if (INTEL_INFO(dev)->gen >= 6) {
> 
> This branch executes only when hw generation is 6 or newer.

and adds gen6_render_ring_get_irq() to vtable which uses ring->irq_lock
which is left uninitialized.

I don't understand what you were trying to say. How does it matter if
some branch executes only for such-and-such hardware, when this branch
contains bugs? Could you please clarify?


> >     		ring->add_request = gen6_add_request;
> >     		ring->irq_get = gen6_render_ring_get_irq;
> >     		ring->irq_put = gen6_render_ring_put_irq;
> >     	} else if (IS_GEN5(dev)) {
> >     		ring->add_request = pc_render_add_request;
> >     		ring->get_seqno = pc_render_get_seqno;
> >     	}

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 15:34                                 ` Kirill Smelkov
@ 2011-08-09 16:02                                   ` Vasily Khoruzhick
  2011-08-09 16:32                                     ` [Intel-gfx] " Kirill Smelkov
  0 siblings, 1 reply; 36+ messages in thread
From: Vasily Khoruzhick @ 2011-08-09 16:02 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Pekka Enberg, Herbert Xu, Luke-Jr, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Ray Lee, Andrew Morton, Linus Torvalds

On Tuesday 09 August 2011 18:34:46 Kirill Smelkov wrote:
> On Tue, Aug 09, 2011 at 06:09:57PM +0300, Vasily Khoruzhick wrote:
> > On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> > > On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > > > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > > > Keith,
> > > > > > > 
> > > > > > > first of all thanks for your prompt reply. Then...
> > > > > > > 
> > > > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > > > <kirr@mns.spb.ru>
> > > > 
> > > > wrote:
> > > > > > > > > And now after v3.0 is out, I've tested it again, and yes,
> > > > > > > > > like it was broken on v3.0-rc5, it is (now even more)
> > > > > > > > > broken on v3.0 -- after first
> > > > > > > > 
> > > > > > > > > bad io access the system freezes completely:
> > > > > > > > I looked at this when I first saw it (a couple of weeks ago),
> > > > > > > > and I couldn't see any obvious reason this patch would cause
> > > > > > > > this particular problem. I didn't want to revert the patch
> > > > > > > > at that point as I feared it would cause other subtle
> > > > > > > > problems. Given that you've got a work-around, it seemed
> > > > > > > > best to just push this off past 3.0.
> > > > > > > 
> > > > > > > What kind of a workaround are you talking about? Sorry, to me
> > > > > > > it all looked like "UMS is being ignored forever". Anyway,
> > > > > > > let's move on to try to solve the issue.
> > > > > > > 
> > > > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > > > units within the hardware status page. If the
> > > > > > > > status_page.page_addr value was zero, then the computed
> > > > > > > > address would end up being 0x84.
> > > > > > > > 
> > > > > > > > And, it looks like status_page.page_addr *will* end up being
> > > > > > > > zero as a result of the patch in question. The patch resets
> > > > > > > > the entire ring structure contents back to the initial
> > > > > > > > values, which includes smashing the status_page structure to
> > > > > > > > zero, clearing the value of status_page.page_addr set in
> > > > > > > > i915_init_phys_hws.
> > > > > > > > 
> > > > > > > > Here's an untested patch which moves the initialization of
> > > > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > > > that intel_init_render_ring_buffer *already* has the setting
> > > > > > > > of the status_page.page_addr value, and so I've removed the
> > > > > > > > setting of status_page.page_addr from i915_init_phys_hws.
> > > > > > > > 
> > > > > > > > I suspect we could remove the memset from
> > > > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > > > given the memset in i915_init_phys_hws.
> > > > > > > > 
> > > > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17
> > > > > > > > 00:00:00 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > > > address in
> > > > > > > > 
> > > > > > > >  intel_render_ring_init_dri
> > > > > > > > 
> > > > > > > > Physically-addressed hardware status pages are initialized
> > > > > > > > early in the driver load process by i915_init_phys_hws. For
> > > > > > > > UMS environments, the ring structure is not initialized
> > > > > > > > until the X server starts. At that point, the entire ring
> > > > > > > > structure is re-initialized with all new values. Any values
> > > > > > > > set in the ring structure (including
> > > > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > > > re-initialized.
> > > > > > > > 
> > > > > > > > This patch moves the initialization of the
> > > > > > > > status_page.page_addr value to intel_render_ring_init_dri.
> > > > > > > > 
> > > > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > > > ---
> > > > > > > > 
> > > > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c
> > > > > > > > 100644 --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > > > drm_device *dev)
> > > > > > > > 
> > > > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > > > >  {
> > > > > > > >  
> > > > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > > 
> > > > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > > > 
> > > > > > > >  	/* Program Hardware Status Page */
> > > > > > > >  	dev_priv->status_page_dmah =
> > > > > > > > 
> > > > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > > > drm_device *dev)
> > > > > > > > 
> > > > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > > > >  		return -ENOMEM;
> > > > > > > >  	
> > > > > > > >  	}
> > > > > > > > 
> > > > > > > > -	ring->status_page.page_addr =
> > > > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah-
>vaddr;
> > > > > > > > 
> > > > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > > > +	memset_io((void __force __iomem
> > > > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > > > 
> > > > > > > >  	i915_write_hws_pga(dev);
> > > > > > > > 
> > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index
> > > > > > > > e961568..47b9b27 100644
> > > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > > > 
> > > > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > > > >  	
> > > > > > > >  	}
> > > > > > > > 
> > > > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > > > +		ring->status_page.page_addr =
> > > > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > > > 
> > > > > > > >  	ring->dev = dev;
> > > > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > > > 
> > > > > > > I can't tell whether this is correct, because intel gfx driver
> > > > > > > is unknown to me, but from the first glance your description
> > > > > > > sounds reasonable.
> > > > > > > 
> > > > > > > I'm out of office till ~ next week's tuesday, and on return
> > > > > > > I'll try to test it on the hardware in question.
> > > > > > 
> > > > > > Keith, thanks again for the patch. As promised I've tested it on
> > > > > > the hardware in question and yes, bad_access is gone and X seems
> > > > > > to work, so thank you, but...
> > > > > > 
> > > > > > 
> > > > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > > > left uninitialized and also ring->irq_lock etc...
> > > > > > 
> > > > > > 
> > > > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > > > beat me too hard, but to me the gist of the problem is the
> > > > > > original patch, where Chris does
> > > > > > 
> > > > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index
> > > > > > > 03e3370..51fbc5e 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >         return intel_init_ring_buffer(dev, ring);
> > > > > > >  
> > > > > > >  }
> > > > > > > 
> > > > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64
> > > > > > > start, u32 size) +{
> > > > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > > > +
> > > > > > > +       *ring = render_ring;
> > > > > > > 
> > > > > >           ^^^^^^^^^^^^^^^^^^^
> > > > > >           here resets
> > > > > > > 
> > > > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > > > +               ring->add_request = gen6_add_request;
> > > > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > > > +       } else if (IS_GEN5(dev)) {
> > > > > > > +               ring->add_request = pc_render_add_request;
> > > > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > > > +       }
> > > > > > 
> > > > > > and then the rest of the `ring` is initialized seemingly
> > > > > > copy-pasted
> > > > > > 
> > > > > > from intel_init_ring_buffer():
> > > > > > > +       ring->dev = dev;
> > > > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > > > +
> > > > > > > +       ring->size = size;
> > > > > > > +       ring->effective_size = ring->size;
> > > > > > > +       if (IS_I830(ring->dev))
> > > > > > > +               ring->effective_size -= 128;
> > > > > > > +
> > > > > > > +       ring->map.offset = start;
> > > > > > > +       ring->map.size = size;
> > > > > > > +       ring->map.type = 0;
> > > > > > > +       ring->map.flags = 0;
> > > > > > > +       ring->map.mtrr = 0;
> > > > > > 
> > > > > > ...
> > > > > > 
> > > > > > where both 3 chunks go almost exactly from
> > > > > > intel_init_ring_buffer(), and ring->effective_size tweak even
> > > > > > stripped original comment:
> > > > > > 
> > > > > > # original version from intel_init_ring_buffer():
> > > > > >         /* Workaround an erratum on the i830 which causes a hang
> > > > > >         if
> > > > > >         
> > > > > >          * the TAIL pointer points to within the last 2
> > > > > >          cachelines * of the buffer.
> > > > > >          */
> > > > > >         
> > > > > >         ring->effective_size = ring->size;
> > > > > >         if (IS_I830(ring->dev))
> > > > > >         
> > > > > >                 ring->effective_size -= 128;
> > > > > > 
> > > > > > ...
> > > > > > 
> > > > > > 
> > > > > > The line marked "here resets" resets all the fields, and maybe
> > > > > > it's not a good idea to re-initialize them all afterwards
> > > > > > (missing some as this thread show), or at least if it is really
> > > > > > needed, share initialization code between
> > > > > > intel_render_ring_init_dri() and intel_init_ring_buffer() ?
> > > > > > 
> > > > > > >From the outside it looks like the offending patch was done as a
> > > > > > >quick
> > > > > > 
> > > > > > fix in a hurry (lots of copy-paste), and maybe it would be better
> > > > > > to re-do it properly...
> > > > > 
> > > > > Silence... ?
> > > > > 
> > > > > I read UMS is still ignored, because e.g. that uninitialized
> > > > > ring->irq_lock which I've wrote about above is for sure used e.g.
> > > > > in gen6_render_ring_get_irq() added to ring vtable in
> > > > > intel_render_ring_init_dri().
> > > > 
> > > > I really doubt that UMS supports gen6 hardware.
> > > 
> > > Then why it is there in intel_render_ring_init_dri():
> > >     int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > >     u32
> > > 
> > > size) {
> > > 
> > >     	drm_i915_private_t *dev_priv = dev->dev_private;
> > >     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > >     	
> > >     	*ring = render_ring;
> > >     	if (INTEL_INFO(dev)->gen >= 6) {
> > 
> > This branch executes only when hw generation is 6 or newer.
> 
> and adds gen6_render_ring_get_irq() to vtable which uses ring->irq_lock
> which is left uninitialized.
> 
> I don't understand what you were trying to say. How does it matter if
> some branch executes only for such-and-such hardware, when this branch
> contains bugs? Could you please clarify?

I want to say that xf86-video-intel with gen6 support does not support UMS. So 
you can't even hit this "bug".

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 16:02                                   ` Vasily Khoruzhick
@ 2011-08-09 16:32                                     ` Kirill Smelkov
  2011-08-09 16:56                                       ` Ray Lee
  0 siblings, 1 reply; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-09 16:32 UTC (permalink / raw)
  To: Vasily Khoruzhick
  Cc: intel-gfx, Keith Packard, Rafael J. Wysocki, Herbert Xu, Luke-Jr,
	LKML, dri-devel, Pekka Enberg, Ray Lee, Andrew Morton,
	Linus Torvalds

On Tue, Aug 09, 2011 at 07:02:59PM +0300, Vasily Khoruzhick wrote:
> On Tuesday 09 August 2011 18:34:46 Kirill Smelkov wrote:
> > On Tue, Aug 09, 2011 at 06:09:57PM +0300, Vasily Khoruzhick wrote:
> > > On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> > > > On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > > > > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > > > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > > > > Keith,
> > > > > > > > 
> > > > > > > > first of all thanks for your prompt reply. Then...
> > > > > > > > 
> > > > > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > > > > <kirr@mns.spb.ru>
> > > > > 
> > > > > wrote:
> > > > > > > > > > And now after v3.0 is out, I've tested it again, and yes,
> > > > > > > > > > like it was broken on v3.0-rc5, it is (now even more)
> > > > > > > > > > broken on v3.0 -- after first
> > > > > > > > > 
> > > > > > > > > > bad io access the system freezes completely:
> > > > > > > > > I looked at this when I first saw it (a couple of weeks ago),
> > > > > > > > > and I couldn't see any obvious reason this patch would cause
> > > > > > > > > this particular problem. I didn't want to revert the patch
> > > > > > > > > at that point as I feared it would cause other subtle
> > > > > > > > > problems. Given that you've got a work-around, it seemed
> > > > > > > > > best to just push this off past 3.0.
> > > > > > > > 
> > > > > > > > What kind of a workaround are you talking about? Sorry, to me
> > > > > > > > it all looked like "UMS is being ignored forever". Anyway,
> > > > > > > > let's move on to try to solve the issue.
> > > > > > > > 
> > > > > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > > > > units within the hardware status page. If the
> > > > > > > > > status_page.page_addr value was zero, then the computed
> > > > > > > > > address would end up being 0x84.
> > > > > > > > > 
> > > > > > > > > And, it looks like status_page.page_addr *will* end up being
> > > > > > > > > zero as a result of the patch in question. The patch resets
> > > > > > > > > the entire ring structure contents back to the initial
> > > > > > > > > values, which includes smashing the status_page structure to
> > > > > > > > > zero, clearing the value of status_page.page_addr set in
> > > > > > > > > i915_init_phys_hws.
> > > > > > > > > 
> > > > > > > > > Here's an untested patch which moves the initialization of
> > > > > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > > > > that intel_init_render_ring_buffer *already* has the setting
> > > > > > > > > of the status_page.page_addr value, and so I've removed the
> > > > > > > > > setting of status_page.page_addr from i915_init_phys_hws.
> > > > > > > > > 
> > > > > > > > > I suspect we could remove the memset from
> > > > > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > > > > given the memset in i915_init_phys_hws.
> > > > > > > > > 
> > > > > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17
> > > > > > > > > 00:00:00 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > > > > address in
> > > > > > > > > 
> > > > > > > > >  intel_render_ring_init_dri
> > > > > > > > > 
> > > > > > > > > Physically-addressed hardware status pages are initialized
> > > > > > > > > early in the driver load process by i915_init_phys_hws. For
> > > > > > > > > UMS environments, the ring structure is not initialized
> > > > > > > > > until the X server starts. At that point, the entire ring
> > > > > > > > > structure is re-initialized with all new values. Any values
> > > > > > > > > set in the ring structure (including
> > > > > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > > > > re-initialized.
> > > > > > > > > 
> > > > > > > > > This patch moves the initialization of the
> > > > > > > > > status_page.page_addr value to intel_render_ring_init_dri.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > > > > ---
> > > > > > > > > 
> > > > > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c
> > > > > > > > > 100644 --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > > > > drm_device *dev)
> > > > > > > > > 
> > > > > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > > > > >  {
> > > > > > > > >  
> > > > > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > > > 
> > > > > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > > > > 
> > > > > > > > >  	/* Program Hardware Status Page */
> > > > > > > > >  	dev_priv->status_page_dmah =
> > > > > > > > > 
> > > > > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > > > > drm_device *dev)
> > > > > > > > > 
> > > > > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > > > > >  		return -ENOMEM;
> > > > > > > > >  	
> > > > > > > > >  	}
> > > > > > > > > 
> > > > > > > > > -	ring->status_page.page_addr =
> > > > > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah-
> >vaddr;
> > > > > > > > > 
> > > > > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > > > > +	memset_io((void __force __iomem
> > > > > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > > > > 
> > > > > > > > >  	i915_write_hws_pga(dev);
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index
> > > > > > > > > e961568..47b9b27 100644
> > > > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > > > > 
> > > > > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > > > > >  	
> > > > > > > > >  	}
> > > > > > > > > 
> > > > > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > > > > +		ring->status_page.page_addr =
> > > > > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > > > > 
> > > > > > > > >  	ring->dev = dev;
> > > > > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > > > > 
> > > > > > > > I can't tell whether this is correct, because intel gfx driver
> > > > > > > > is unknown to me, but from the first glance your description
> > > > > > > > sounds reasonable.
> > > > > > > > 
> > > > > > > > I'm out of office till ~ next week's tuesday, and on return
> > > > > > > > I'll try to test it on the hardware in question.
> > > > > > > 
> > > > > > > Keith, thanks again for the patch. As promised I've tested it on
> > > > > > > the hardware in question and yes, bad_access is gone and X seems
> > > > > > > to work, so thank you, but...
> > > > > > > 
> > > > > > > 
> > > > > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > > > > left uninitialized and also ring->irq_lock etc...
> > > > > > > 
> > > > > > > 
> > > > > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > > > > beat me too hard, but to me the gist of the problem is the
> > > > > > > original patch, where Chris does
> > > > > > > 
> > > > > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > > > > 
> > > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index
> > > > > > > > 03e3370..51fbc5e 100644
> > > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > > > > drm_device *dev)
> > > > > > > > 
> > > > > > > >         return intel_init_ring_buffer(dev, ring);
> > > > > > > >  
> > > > > > > >  }
> > > > > > > > 
> > > > > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64
> > > > > > > > start, u32 size) +{
> > > > > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > > > > +
> > > > > > > > +       *ring = render_ring;
> > > > > > > > 
> > > > > > >           ^^^^^^^^^^^^^^^^^^^
> > > > > > >           here resets
> > > > > > > > 
> > > > > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > > > > +               ring->add_request = gen6_add_request;
> > > > > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > > > > +       } else if (IS_GEN5(dev)) {
> > > > > > > > +               ring->add_request = pc_render_add_request;
> > > > > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > > > > +       }
> > > > > > > 
> > > > > > > and then the rest of the `ring` is initialized seemingly
> > > > > > > copy-pasted
> > > > > > > 
> > > > > > > from intel_init_ring_buffer():
> > > > > > > > +       ring->dev = dev;
> > > > > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > > > > +
> > > > > > > > +       ring->size = size;
> > > > > > > > +       ring->effective_size = ring->size;
> > > > > > > > +       if (IS_I830(ring->dev))
> > > > > > > > +               ring->effective_size -= 128;
> > > > > > > > +
> > > > > > > > +       ring->map.offset = start;
> > > > > > > > +       ring->map.size = size;
> > > > > > > > +       ring->map.type = 0;
> > > > > > > > +       ring->map.flags = 0;
> > > > > > > > +       ring->map.mtrr = 0;
> > > > > > > 
> > > > > > > ...
> > > > > > > 
> > > > > > > where both 3 chunks go almost exactly from
> > > > > > > intel_init_ring_buffer(), and ring->effective_size tweak even
> > > > > > > stripped original comment:
> > > > > > > 
> > > > > > > # original version from intel_init_ring_buffer():
> > > > > > >         /* Workaround an erratum on the i830 which causes a hang
> > > > > > >         if
> > > > > > >         
> > > > > > >          * the TAIL pointer points to within the last 2
> > > > > > >          cachelines * of the buffer.
> > > > > > >          */
> > > > > > >         
> > > > > > >         ring->effective_size = ring->size;
> > > > > > >         if (IS_I830(ring->dev))
> > > > > > >         
> > > > > > >                 ring->effective_size -= 128;
> > > > > > > 
> > > > > > > ...
> > > > > > > 
> > > > > > > 
> > > > > > > The line marked "here resets" resets all the fields, and maybe
> > > > > > > it's not a good idea to re-initialize them all afterwards
> > > > > > > (missing some as this thread show), or at least if it is really
> > > > > > > needed, share initialization code between
> > > > > > > intel_render_ring_init_dri() and intel_init_ring_buffer() ?
> > > > > > > 
> > > > > > > >From the outside it looks like the offending patch was done as a
> > > > > > > >quick
> > > > > > > 
> > > > > > > fix in a hurry (lots of copy-paste), and maybe it would be better
> > > > > > > to re-do it properly...
> > > > > > 
> > > > > > Silence... ?
> > > > > > 
> > > > > > I read UMS is still ignored, because e.g. that uninitialized
> > > > > > ring->irq_lock which I've wrote about above is for sure used e.g.
> > > > > > in gen6_render_ring_get_irq() added to ring vtable in
> > > > > > intel_render_ring_init_dri().
> > > > > 
> > > > > I really doubt that UMS supports gen6 hardware.
> > > > 
> > > > Then why it is there in intel_render_ring_init_dri():
> > > >     int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > > >     u32
> > > > 
> > > > size) {
> > > > 
> > > >     	drm_i915_private_t *dev_priv = dev->dev_private;
> > > >     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > >     	
> > > >     	*ring = render_ring;
> > > >     	if (INTEL_INFO(dev)->gen >= 6) {
> > > 
> > > This branch executes only when hw generation is 6 or newer.
> > 
> > and adds gen6_render_ring_get_irq() to vtable which uses ring->irq_lock
> > which is left uninitialized.
> > 
> > I don't understand what you were trying to say. How does it matter if
> > some branch executes only for such-and-such hardware, when this branch
> > contains bugs? Could you please clarify?
> 
> I want to say that xf86-video-intel with gen6 support does not support UMS. So 
> you can't even hit this "bug".


Ok, but so then there is a dead code in the kernel, right? Or not dead
at all because potentially some non-X userspace could trigger the bug.

Why it was added in the first place?


To me, intel_render_ring_init_dri() looks like being copy-pasted from
several places in a hurry. And I was already beaten by one bug
introduced in it, without a single response for 3 kernel cycles though
I've asked for help several times and provided detailed info.

Finally Keith analyzed and plugged NULL-pointer dereference (thanks)
but I'm telling, it seems there are more bugs introduced in e8616b6c.

The patch title says "drm/i915: Initialise ring vfuncs for old DRI
paths" and one could ask, why couldn't it be done without bugs and
regressions. Are we waiting for another one hitting left bugs instead of
fix them in the first place?

Quite frankly, I don't understand intel-gfx developers attitude: why is
it me, just random user who is nitpicking here? Why there is no
interest/will to analyze now obviously buggy/duplicate code and fix it?


If support for UMS/old-dri/whatever is dropped, could you please say so
and clean the driver from legacy code and move on. That would be at
least fair for people not hoping their old setups will continue to
work.


Thanks,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 16:32                                     ` [Intel-gfx] " Kirill Smelkov
@ 2011-08-09 16:56                                       ` Ray Lee
  2011-08-09 17:40                                         ` Kirill Smelkov
  0 siblings, 1 reply; 36+ messages in thread
From: Ray Lee @ 2011-08-09 16:56 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Vasily Khoruzhick, intel-gfx, Keith Packard, Rafael J. Wysocki,
	Herbert Xu, Luke-Jr, LKML, dri-devel, Pekka Enberg,
	Andrew Morton, Linus Torvalds

On Tue, Aug 9, 2011 at 9:32 AM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> Quite frankly, I don't understand intel-gfx developers attitude: why is
> it me, just random user who is nitpicking here? Why there is no
> interest/will to analyze now obviously buggy/duplicate code and fix it?

Because they don't have an infinite amount of manpower. Actual bugs
hitting actual users take precedence over 'cleanups' which always have
a chance of causing regressions, as you're well aware. Code churn for
the sake of abstract prettiness is discouraged, as it has a potential
cost for little potential gain.

If you like, submit a patch. You may now be more up-to-date on those
particular code paths than most of the intel-gfx developers.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 16:56                                       ` Ray Lee
@ 2011-08-09 17:40                                         ` Kirill Smelkov
  2011-08-09 17:43                                           ` [Intel-gfx] " Ray Lee
  2011-08-10  9:41                                           ` [Intel-gfx] " Alan Cox
  0 siblings, 2 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-09 17:40 UTC (permalink / raw)
  To: Ray Lee
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Andrew Morton, Linus Torvalds, Pekka Enberg

On Tue, Aug 09, 2011 at 09:56:01AM -0700, Ray Lee wrote:
> On Tue, Aug 9, 2011 at 9:32 AM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> > Quite frankly, I don't understand intel-gfx developers attitude: why is
> > it me, just random user who is nitpicking here? Why there is no
> > interest/will to analyze now obviously buggy/duplicate code and fix it?
> 
> Because they don't have an infinite amount of manpower. Actual bugs
> hitting actual users take precedence over 'cleanups' which always have
> a chance of causing regressions, as you're well aware. Code churn for
> the sake of abstract prettiness is discouraged, as it has a potential
> cost for little potential gain.
> 
> If you like, submit a patch. You may now be more up-to-date on those
> particular code paths than most of the intel-gfx developers.

Ray, I'd agree with you if the topic was about cleanups.

But here I was talking about copy-pasty commit which introduced
regressions and bugs, and if now it's a user dilemma to either "clean up"
it after developers himself, or accept that something is broken because
developers lack manpower and so plug things in a hurry increasing
entropy, I'd like to remind a good rule, at least to me one more time,
not to break things in the first place.

I'm not talking about cleanup here. I'm talking about original commit
which introduced problems, and that there is no need to clean it up, but
better revert and redo properly to avoid subsequent code churn in lots
of fixes.


Sorry, I won't submit a patch. If there is a need to find/fix/cleanup
obvious things after company's developers, I have better things to do,
and a todo item to re-evaluate hardware for my next project.


Thanks,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 17:40                                         ` Kirill Smelkov
@ 2011-08-09 17:43                                           ` Ray Lee
  2011-08-10  8:36                                             ` Kirill Smelkov
  2011-08-10  9:41                                           ` [Intel-gfx] " Alan Cox
  1 sibling, 1 reply; 36+ messages in thread
From: Ray Lee @ 2011-08-09 17:43 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Vasily Khoruzhick, intel-gfx, Keith Packard, Rafael J. Wysocki,
	Herbert Xu, Luke-Jr, LKML, dri-devel, Pekka Enberg,
	Andrew Morton, Linus Torvalds

On Tue, Aug 9, 2011 at 10:40 AM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
>> If you like, submit a patch. You may now be more up-to-date on those
>> particular code paths than most of the intel-gfx developers.
>
> Ray, I'd agree with you if the topic was about cleanups.

At this point it is about cleanups unless Keith's patch upthread does
not work for you. Does it or not?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 17:43                                           ` [Intel-gfx] " Ray Lee
@ 2011-08-10  8:36                                             ` Kirill Smelkov
  0 siblings, 0 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-10  8:36 UTC (permalink / raw)
  To: Ray Lee
  Cc: Rafael J. Wysocki, Herbert Xu, Luke-Jr, intel-gfx, LKML,
	dri-devel, Andrew Morton, Linus Torvalds, Pekka Enberg

On Tue, Aug 09, 2011 at 10:43:08AM -0700, Ray Lee wrote:
> On Tue, Aug 9, 2011 at 10:40 AM, Kirill Smelkov <kirr@mns.spb.ru> wrote:
> >> If you like, submit a patch. You may now be more up-to-date on those
> >> particular code paths than most of the intel-gfx developers.
> >
> > Ray, I'd agree with you if the topic was about cleanups.
> 
> At this point it is about cleanups unless Keith's patch upthread does
> not work for you. Does it or not?

I've already wrote two weeks ago it does, but if this needs to be
restated one more time here it is: Keith's patch fixes the problem in a
sense that X now starts and seemingly works (thanks), but several issues
remain to be there imho. I've got the message, if it's ok for intel-gfx
to leave them as is - it's ok for me.


Peace,
Kirill

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-09 17:40                                         ` Kirill Smelkov
  2011-08-09 17:43                                           ` [Intel-gfx] " Ray Lee
@ 2011-08-10  9:41                                           ` Alan Cox
  2011-08-10 11:37                                             ` Kirill Smelkov
  1 sibling, 1 reply; 36+ messages in thread
From: Alan Cox @ 2011-08-10  9:41 UTC (permalink / raw)
  To: Kirill Smelkov
  Cc: Vasily Khoruzhick, Herbert Xu, Ray Lee, intel-gfx, LKML,
	dri-devel, Rafael J. Wysocki, Luke-Jr, Andrew Morton,
	Linus Torvalds, Pekka Enberg

> Sorry, I won't submit a patch. If there is a need to find/fix/cleanup
> obvious things after company's developers, I have better things to do,
> and a todo item to re-evaluate hardware for my next project.

You seem confused. If you have a support contract of some form with a
Linux supplier or Intel please contact your support. This mailing list
isn't for support services.

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
  2011-08-10  9:41                                           ` [Intel-gfx] " Alan Cox
@ 2011-08-10 11:37                                             ` Kirill Smelkov
  0 siblings, 0 replies; 36+ messages in thread
From: Kirill Smelkov @ 2011-08-10 11:37 UTC (permalink / raw)
  To: Alan Cox
  Cc: Herbert Xu, Ray Lee, intel-gfx, LKML, dri-devel,
	Rafael J. Wysocki, Luke-Jr, Andrew Morton, Linus Torvalds,
	Pekka Enberg

On Wed, Aug 10, 2011 at 10:41:44AM +0100, Alan Cox wrote:
> > Sorry, I won't submit a patch. If there is a need to find/fix/cleanup
> > obvious things after company's developers, I have better things to do,
> > and a todo item to re-evaluate hardware for my next project.
> 
> You seem confused. If you have a support contract of some form with a
> Linux supplier or Intel please contact your support. This mailing list
> isn't for support services.

Thanks for clarifying.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2011-08-10 11:37 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-20 17:06 Major 2.6.38 regression ignored? Luke-Jr
2011-05-20 18:08 ` Ray Lee
2011-05-20 20:24   ` Rafael J. Wysocki
2011-05-20 21:11     ` Ray Lee
2011-05-21  8:41   ` Chris Wilson
2011-05-21 15:23     ` Luke-Jr
2011-05-21 15:40       ` Chris Wilson
2011-05-21 19:33         ` Luke-Jr
2011-05-28 13:19         ` Major 2.6.38 / 2.6.39 " Kirill Smelkov
2011-07-12 17:17           ` [Intel-gfx] " Kirill Smelkov
2011-07-12 18:07             ` Pekka Enberg
     [not found]               ` <CA+55aFxakA2U+oMJ1T7awTYa+p6xp9N0aCbfrUqgkF7BJ8gnQw@mail.gmail.com>
2011-07-22 11:08                 ` Major 2.6.38 / 2.6.39 / 3.0 " Kirill Smelkov
2011-07-22 14:12                   ` Herbert Xu
2011-07-22 18:00                   ` Keith Packard
2011-07-22 20:23                     ` Kirill Smelkov
2011-07-22 20:50                       ` Keith Packard
2011-07-22 21:08                         ` Kirill Smelkov
2011-07-22 21:31                           ` Kirill Smelkov
2011-07-23 15:10                             ` [Intel-gfx] " Alex Deucher
2011-07-23 18:19                               ` Kirill Smelkov
2011-07-23 15:55                         ` Pekka Enberg
2011-07-25  4:29                           ` Keith Packard
2011-07-26 13:48                       ` [Intel-gfx] " Kirill Smelkov
2011-08-09 12:08                         ` Kirill Smelkov
2011-08-09 14:00                           ` Vasily Khoruzhick
2011-08-09 14:47                             ` Kirill Smelkov
2011-08-09 15:09                               ` Vasily Khoruzhick
2011-08-09 15:34                                 ` Kirill Smelkov
2011-08-09 16:02                                   ` Vasily Khoruzhick
2011-08-09 16:32                                     ` [Intel-gfx] " Kirill Smelkov
2011-08-09 16:56                                       ` Ray Lee
2011-08-09 17:40                                         ` Kirill Smelkov
2011-08-09 17:43                                           ` [Intel-gfx] " Ray Lee
2011-08-10  8:36                                             ` Kirill Smelkov
2011-08-10  9:41                                           ` [Intel-gfx] " Alan Cox
2011-08-10 11:37                                             ` Kirill Smelkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).