All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kirill Smelkov <kirr@mns.spb.ru>
To: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: intel-gfx@lists.freedesktop.org,
	Keith Packard <keithp@keithp.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Herbert Xu <herbert@gondor.hengli.com.au>,
	Luke-Jr <luke@dashjr.org>, LKML <linux-kernel@vger.kernel.org>,
	dri-devel@lists.freedesktop.org,
	Pekka Enberg <penberg@kernel.org>, Ray Lee <ray-lk@madrabbit.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [Intel-gfx] Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
Date: Tue, 9 Aug 2011 19:34:46 +0400	[thread overview]
Message-ID: <20110809153446.GA22077@tugrik.mns.mnsspb.ru> (raw)
In-Reply-To: <201108091809.58425.anarsoul@gmail.com>

On Tue, Aug 09, 2011 at 06:09:57PM +0300, Vasily Khoruzhick wrote:
> On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> > On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > > Keith,
> > > > > > 
> > > > > > first of all thanks for your prompt reply. Then...
> > > > > > 
> > > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > > <kirr@mns.spb.ru>
> > > 
> > > wrote:
> > > > > > > > And now after v3.0 is out, I've tested it again, and yes, like
> > > > > > > > it was broken on v3.0-rc5, it is (now even more) broken on
> > > > > > > > v3.0 -- after first
> > > > > > > 
> > > > > > > > bad io access the system freezes completely:
> > > > > > > I looked at this when I first saw it (a couple of weeks ago), and
> > > > > > > I couldn't see any obvious reason this patch would cause this
> > > > > > > particular problem. I didn't want to revert the patch at that
> > > > > > > point as I feared it would cause other subtle problems. Given
> > > > > > > that you've got a work-around, it seemed best to just push this
> > > > > > > off past 3.0.
> > > > > > 
> > > > > > What kind of a workaround are you talking about? Sorry, to me it
> > > > > > all looked like "UMS is being ignored forever". Anyway, let's move
> > > > > > on to try to solve the issue.
> > > > > > 
> > > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > > units within the hardware status page. If the
> > > > > > > status_page.page_addr value was zero, then the computed address
> > > > > > > would end up being 0x84.
> > > > > > > 
> > > > > > > And, it looks like status_page.page_addr *will* end up being zero
> > > > > > > as a result of the patch in question. The patch resets the
> > > > > > > entire ring structure contents back to the initial values, which
> > > > > > > includes smashing the status_page structure to zero, clearing
> > > > > > > the value of status_page.page_addr set in i915_init_phys_hws.
> > > > > > > 
> > > > > > > Here's an untested patch which moves the initialization of
> > > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > > that intel_init_render_ring_buffer *already* has the setting of
> > > > > > > the status_page.page_addr value, and so I've removed the setting
> > > > > > > of status_page.page_addr from i915_init_phys_hws.
> > > > > > > 
> > > > > > > I suspect we could remove the memset from
> > > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > > given the memset in i915_init_phys_hws.
> > > > > > > 
> > > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > > address in
> > > > > > > 
> > > > > > >  intel_render_ring_init_dri
> > > > > > > 
> > > > > > > Physically-addressed hardware status pages are initialized early
> > > > > > > in the driver load process by i915_init_phys_hws. For UMS
> > > > > > > environments, the ring structure is not initialized until the X
> > > > > > > server starts. At that point, the entire ring structure is
> > > > > > > re-initialized with all new values. Any values set in the ring
> > > > > > > structure (including
> > > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > > re-initialized.
> > > > > > > 
> > > > > > > This patch moves the initialization of the status_page.page_addr
> > > > > > > value to intel_render_ring_init_dri.
> > > > > > > 
> > > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > > ---
> > > > > > > 
> > > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > > >  {
> > > > > > >  
> > > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > 
> > > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > > 
> > > > > > >  	/* Program Hardware Status Page */
> > > > > > >  	dev_priv->status_page_dmah =
> > > > > > > 
> > > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > > >  		return -ENOMEM;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > -	ring->status_page.page_addr =
> > > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > > > > 
> > > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > > +	memset_io((void __force __iomem
> > > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > > 
> > > > > > >  	i915_write_hws_pga(dev);
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > > > > 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > > 
> > > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > > +		ring->status_page.page_addr =
> > > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > > 
> > > > > > >  	ring->dev = dev;
> > > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > > 
> > > > > > I can't tell whether this is correct, because intel gfx driver is
> > > > > > unknown to me, but from the first glance your description sounds
> > > > > > reasonable.
> > > > > > 
> > > > > > I'm out of office till ~ next week's tuesday, and on return I'll
> > > > > > try to test it on the hardware in question.
> > > > > 
> > > > > Keith, thanks again for the patch. As promised I've tested it on the
> > > > > hardware in question and yes, bad_access is gone and X seems to work,
> > > > > so thank you, but...
> > > > > 
> > > > > 
> > > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > > left uninitialized and also ring->irq_lock etc...
> > > > > 
> > > > > 
> > > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > > beat me too hard, but to me the gist of the problem is the original
> > > > > patch, where Chris does
> > > > > 
> > > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > > drm_device *dev)
> > > > > > 
> > > > > >         return intel_init_ring_buffer(dev, ring);
> > > > > >  
> > > > > >  }
> > > > > > 
> > > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > > > > > u32 size) +{
> > > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > > +
> > > > > > +       *ring = render_ring;
> > > > > > 
> > > > >           ^^^^^^^^^^^^^^^^^^^
> > > > >           here resets
> > > > > > 
> > > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > > +               ring->add_request = gen6_add_request;
> > > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > > +       } else if (IS_GEN5(dev)) {
> > > > > > +               ring->add_request = pc_render_add_request;
> > > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > > +       }
> > > > > 
> > > > > and then the rest of the `ring` is initialized seemingly copy-pasted
> > > > > 
> > > > > from intel_init_ring_buffer():
> > > > > > +       ring->dev = dev;
> > > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > > +
> > > > > > +       ring->size = size;
> > > > > > +       ring->effective_size = ring->size;
> > > > > > +       if (IS_I830(ring->dev))
> > > > > > +               ring->effective_size -= 128;
> > > > > > +
> > > > > > +       ring->map.offset = start;
> > > > > > +       ring->map.size = size;
> > > > > > +       ring->map.type = 0;
> > > > > > +       ring->map.flags = 0;
> > > > > > +       ring->map.mtrr = 0;
> > > > > 
> > > > > ...
> > > > > 
> > > > > where both 3 chunks go almost exactly from intel_init_ring_buffer(),
> > > > > and ring->effective_size tweak even stripped original comment:
> > > > > 
> > > > > # original version from intel_init_ring_buffer():
> > > > >         /* Workaround an erratum on the i830 which causes a hang if
> > > > >         
> > > > >          * the TAIL pointer points to within the last 2 cachelines
> > > > >          * of the buffer.
> > > > >          */
> > > > >         
> > > > >         ring->effective_size = ring->size;
> > > > >         if (IS_I830(ring->dev))
> > > > >         
> > > > >                 ring->effective_size -= 128;
> > > > > 
> > > > > ...
> > > > > 
> > > > > 
> > > > > The line marked "here resets" resets all the fields, and maybe it's
> > > > > not a good idea to re-initialize them all afterwards (missing some
> > > > > as this thread show), or at least if it is really needed, share
> > > > > initialization code between intel_render_ring_init_dri() and
> > > > > intel_init_ring_buffer() ?
> > > > > 
> > > > > >From the outside it looks like the offending patch was done as a
> > > > > >quick
> > > > > 
> > > > > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > > > > re-do it properly...
> > > > 
> > > > Silence... ?
> > > > 
> > > > I read UMS is still ignored, because e.g. that uninitialized
> > > > ring->irq_lock which I've wrote about above is for sure used e.g. in
> > > > gen6_render_ring_get_irq() added to ring vtable in
> > > > intel_render_ring_init_dri().
> > > 
> > > I really doubt that UMS supports gen6 hardware.
> > 
> > Then why it is there in intel_render_ring_init_dri():
> > 
> >     int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> > size) {
> >     	drm_i915_private_t *dev_priv = dev->dev_private;
> >     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > 
> >     	*ring = render_ring;
> >     	if (INTEL_INFO(dev)->gen >= 6) {
> 
> This branch executes only when hw generation is 6 or newer.

and adds gen6_render_ring_get_irq() to vtable which uses ring->irq_lock
which is left uninitialized.

I don't understand what you were trying to say. How does it matter if
some branch executes only for such-and-such hardware, when this branch
contains bugs? Could you please clarify?


> >     		ring->add_request = gen6_add_request;
> >     		ring->irq_get = gen6_render_ring_get_irq;
> >     		ring->irq_put = gen6_render_ring_put_irq;
> >     	} else if (IS_GEN5(dev)) {
> >     		ring->add_request = pc_render_add_request;
> >     		ring->get_seqno = pc_render_get_seqno;
> >     	}

WARNING: multiple messages have this Message-ID (diff)
From: Kirill Smelkov <kirr@mns.spb.ru>
To: Vasily Khoruzhick <anarsoul@gmail.com>
Cc: Pekka Enberg <penberg@kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Luke-Jr <luke@dashjr.org>,
	intel-gfx@lists.freedesktop.org,
	LKML <linux-kernel@vger.kernel.org>,
	dri-devel@lists.freedesktop.org,
	"Rafael J. Wysocki" <rjw@sisk.pl>, Ray Lee <ray-lk@madrabbit.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Major 2.6.38 / 2.6.39 / 3.0 regression ignored?
Date: Tue, 9 Aug 2011 19:34:46 +0400	[thread overview]
Message-ID: <20110809153446.GA22077@tugrik.mns.mnsspb.ru> (raw)
In-Reply-To: <201108091809.58425.anarsoul@gmail.com>

On Tue, Aug 09, 2011 at 06:09:57PM +0300, Vasily Khoruzhick wrote:
> On Tuesday 09 August 2011 17:47:56 Kirill Smelkov wrote:
> > On Tue, Aug 09, 2011 at 05:00:52PM +0300, Vasily Khoruzhick wrote:
> > > On Tuesday 09 August 2011 15:08:03 Kirill Smelkov wrote:
> > > > On Tue, Jul 26, 2011 at 05:48:27PM +0400, Kirill Smelkov wrote:
> > > > > On Sat, Jul 23, 2011 at 12:23:36AM +0400, Kirill Smelkov wrote:
> > > > > > Keith,
> > > > > > 
> > > > > > first of all thanks for your prompt reply. Then...
> > > > > > 
> > > > > > On Fri, Jul 22, 2011 at 11:00:41AM -0700, Keith Packard wrote:
> > > > > > > On Fri, 22 Jul 2011 15:08:06 +0400, Kirill Smelkov
> > > > > > > <kirr@mns.spb.ru>
> > > 
> > > wrote:
> > > > > > > > And now after v3.0 is out, I've tested it again, and yes, like
> > > > > > > > it was broken on v3.0-rc5, it is (now even more) broken on
> > > > > > > > v3.0 -- after first
> > > > > > > 
> > > > > > > > bad io access the system freezes completely:
> > > > > > > I looked at this when I first saw it (a couple of weeks ago), and
> > > > > > > I couldn't see any obvious reason this patch would cause this
> > > > > > > particular problem. I didn't want to revert the patch at that
> > > > > > > point as I feared it would cause other subtle problems. Given
> > > > > > > that you've got a work-around, it seemed best to just push this
> > > > > > > off past 3.0.
> > > > > > 
> > > > > > What kind of a workaround are you talking about? Sorry, to me it
> > > > > > all looked like "UMS is being ignored forever". Anyway, let's move
> > > > > > on to try to solve the issue.
> > > > > > 
> > > > > > > Given the failing address passed to ioread32, this seems like
> > > > > > > it's probably the call to READ_BREADCRUMB --
> > > > > > > I915_BREADCRUMB_INDEX is 0x21, which is an offset in 32-bit
> > > > > > > units within the hardware status page. If the
> > > > > > > status_page.page_addr value was zero, then the computed address
> > > > > > > would end up being 0x84.
> > > > > > > 
> > > > > > > And, it looks like status_page.page_addr *will* end up being zero
> > > > > > > as a result of the patch in question. The patch resets the
> > > > > > > entire ring structure contents back to the initial values, which
> > > > > > > includes smashing the status_page structure to zero, clearing
> > > > > > > the value of status_page.page_addr set in i915_init_phys_hws.
> > > > > > > 
> > > > > > > Here's an untested patch which moves the initialization of
> > > > > > > status_page.page_addr into intel_render_ring_init_dri. I note
> > > > > > > that intel_init_render_ring_buffer *already* has the setting of
> > > > > > > the status_page.page_addr value, and so I've removed the setting
> > > > > > > of status_page.page_addr from i915_init_phys_hws.
> > > > > > > 
> > > > > > > I suspect we could remove the memset from
> > > > > > > intel_init_render_ring_buffer; it seems entirely superfluous
> > > > > > > given the memset in i915_init_phys_hws.
> > > > > > > 
> > > > > > > From 159ba1dd207fc52590ce8a3afd83f40bd2cedf46 Mon Sep 17 00:00:00
> > > > > > > 2001 From: Keith Packard <keithp@keithp.com>
> > > > > > > Date: Fri, 22 Jul 2011 10:44:39 -0700
> > > > > > > Subject: [PATCH] drm/i915: Initialize RCS ring status page
> > > > > > > address in
> > > > > > > 
> > > > > > >  intel_render_ring_init_dri
> > > > > > > 
> > > > > > > Physically-addressed hardware status pages are initialized early
> > > > > > > in the driver load process by i915_init_phys_hws. For UMS
> > > > > > > environments, the ring structure is not initialized until the X
> > > > > > > server starts. At that point, the entire ring structure is
> > > > > > > re-initialized with all new values. Any values set in the ring
> > > > > > > structure (including
> > > > > > > ring->status_page.page_addr) will be lost when the ring is
> > > > > > > re-initialized.
> > > > > > > 
> > > > > > > This patch moves the initialization of the status_page.page_addr
> > > > > > > value to intel_render_ring_init_dri.
> > > > > > > 
> > > > > > > Signed-off-by: Keith Packard <keithp@keithp.com>
> > > > > > > ---
> > > > > > > 
> > > > > > >  drivers/gpu/drm/i915/i915_dma.c         |    6 ++----
> > > > > > >  drivers/gpu/drm/i915/intel_ringbuffer.c |    3 +++
> > > > > > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > b/drivers/gpu/drm/i915/i915_dma.c index 1271282..8a3942c 100644
> > > > > > > --- a/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > > > > > > @@ -61,7 +61,6 @@ static void i915_write_hws_pga(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  static int i915_init_phys_hws(struct drm_device *dev)
> > > > > > >  {
> > > > > > >  
> > > > > > >  	drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > > 
> > > > > > > -	struct intel_ring_buffer *ring = LP_RING(dev_priv);
> > > > > > > 
> > > > > > >  	/* Program Hardware Status Page */
> > > > > > >  	dev_priv->status_page_dmah =
> > > > > > > 
> > > > > > > @@ -71,10 +70,9 @@ static int i915_init_phys_hws(struct
> > > > > > > drm_device *dev)
> > > > > > > 
> > > > > > >  		DRM_ERROR("Can not allocate hardware status page\n");
> > > > > > >  		return -ENOMEM;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > -	ring->status_page.page_addr =
> > > > > > > -		(void __force __iomem *)dev_priv->status_page_dmah->vaddr;
> > > > > > > 
> > > > > > > -	memset_io(ring->status_page.page_addr, 0, PAGE_SIZE);
> > > > > > > +	memset_io((void __force __iomem
> > > > > > > *)dev_priv->status_page_dmah->vaddr, +		  0, PAGE_SIZE);
> > > > > > > 
> > > > > > >  	i915_write_hws_pga(dev);
> > > > > > > 
> > > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index e961568..47b9b27
> > > > > > > 100644
> > > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > > @@ -1321,6 +1321,9 @@ int intel_render_ring_init_dri(struct
> > > > > > > drm_device *dev, u64 start, u32 size)
> > > > > > > 
> > > > > > >  		ring->get_seqno = pc_render_get_seqno;
> > > > > > >  	
> > > > > > >  	}
> > > > > > > 
> > > > > > > +	if (!I915_NEED_GFX_HWS(dev))
> > > > > > > +		ring->status_page.page_addr =
> > > > > > > dev_priv->status_page_dmah->vaddr; +
> > > > > > > 
> > > > > > >  	ring->dev = dev;
> > > > > > >  	INIT_LIST_HEAD(&ring->active_list);
> > > > > > >  	INIT_LIST_HEAD(&ring->request_list);
> > > > > > 
> > > > > > I can't tell whether this is correct, because intel gfx driver is
> > > > > > unknown to me, but from the first glance your description sounds
> > > > > > reasonable.
> > > > > > 
> > > > > > I'm out of office till ~ next week's tuesday, and on return I'll
> > > > > > try to test it on the hardware in question.
> > > > > 
> > > > > Keith, thanks again for the patch. As promised I've tested it on the
> > > > > hardware in question and yes, bad_access is gone and X seems to work,
> > > > > so thank you, but...
> > > > > 
> > > > > 
> > > > > I see there are more such bugs in introduced-in-guilty-patch
> > > > > intel_render_ring_init_dri(). For example ring->irq_queue is
> > > > > left uninitialized and also ring->irq_lock etc...
> > > > > 
> > > > > 
> > > > > I'm X newbie, so if here is something stupid X-wise, please don't
> > > > > beat me too hard, but to me the gist of the problem is the original
> > > > > patch, where Chris does
> > > > > 
> > > > > ( git show e8616b6ced6137085e6657cc63bc2fe3900b8616 )
> > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > b/drivers/gpu/drm/i915/intel_ringbuffer.c index 03e3370..51fbc5e
> > > > > > 100644
> > > > > > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > > > > > @@ -1291,6 +1291,48 @@ int intel_init_render_ring_buffer(struct
> > > > > > drm_device *dev)
> > > > > > 
> > > > > >         return intel_init_ring_buffer(dev, ring);
> > > > > >  
> > > > > >  }
> > > > > > 
> > > > > > +int intel_render_ring_init_dri(struct drm_device *dev, u64 start,
> > > > > > u32 size) +{
> > > > > > +       drm_i915_private_t *dev_priv = dev->dev_private;
> > > > > > +       struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > > > > > +
> > > > > > +       *ring = render_ring;
> > > > > > 
> > > > >           ^^^^^^^^^^^^^^^^^^^
> > > > >           here resets
> > > > > > 
> > > > > > +       if (INTEL_INFO(dev)->gen >= 6) {
> > > > > > +               ring->add_request = gen6_add_request;
> > > > > > +               ring->irq_get = gen6_render_ring_get_irq;
> > > > > > +               ring->irq_put = gen6_render_ring_put_irq;
> > > > > > +       } else if (IS_GEN5(dev)) {
> > > > > > +               ring->add_request = pc_render_add_request;
> > > > > > +               ring->get_seqno = pc_render_get_seqno;
> > > > > > +       }
> > > > > 
> > > > > and then the rest of the `ring` is initialized seemingly copy-pasted
> > > > > 
> > > > > from intel_init_ring_buffer():
> > > > > > +       ring->dev = dev;
> > > > > > +       INIT_LIST_HEAD(&ring->active_list);
> > > > > > +       INIT_LIST_HEAD(&ring->request_list);
> > > > > > +       INIT_LIST_HEAD(&ring->gpu_write_list);
> > > > > > +
> > > > > > +       ring->size = size;
> > > > > > +       ring->effective_size = ring->size;
> > > > > > +       if (IS_I830(ring->dev))
> > > > > > +               ring->effective_size -= 128;
> > > > > > +
> > > > > > +       ring->map.offset = start;
> > > > > > +       ring->map.size = size;
> > > > > > +       ring->map.type = 0;
> > > > > > +       ring->map.flags = 0;
> > > > > > +       ring->map.mtrr = 0;
> > > > > 
> > > > > ...
> > > > > 
> > > > > where both 3 chunks go almost exactly from intel_init_ring_buffer(),
> > > > > and ring->effective_size tweak even stripped original comment:
> > > > > 
> > > > > # original version from intel_init_ring_buffer():
> > > > >         /* Workaround an erratum on the i830 which causes a hang if
> > > > >         
> > > > >          * the TAIL pointer points to within the last 2 cachelines
> > > > >          * of the buffer.
> > > > >          */
> > > > >         
> > > > >         ring->effective_size = ring->size;
> > > > >         if (IS_I830(ring->dev))
> > > > >         
> > > > >                 ring->effective_size -= 128;
> > > > > 
> > > > > ...
> > > > > 
> > > > > 
> > > > > The line marked "here resets" resets all the fields, and maybe it's
> > > > > not a good idea to re-initialize them all afterwards (missing some
> > > > > as this thread show), or at least if it is really needed, share
> > > > > initialization code between intel_render_ring_init_dri() and
> > > > > intel_init_ring_buffer() ?
> > > > > 
> > > > > >From the outside it looks like the offending patch was done as a
> > > > > >quick
> > > > > 
> > > > > fix in a hurry (lots of copy-paste), and maybe it would be better to
> > > > > re-do it properly...
> > > > 
> > > > Silence... ?
> > > > 
> > > > I read UMS is still ignored, because e.g. that uninitialized
> > > > ring->irq_lock which I've wrote about above is for sure used e.g. in
> > > > gen6_render_ring_get_irq() added to ring vtable in
> > > > intel_render_ring_init_dri().
> > > 
> > > I really doubt that UMS supports gen6 hardware.
> > 
> > Then why it is there in intel_render_ring_init_dri():
> > 
> >     int intel_render_ring_init_dri(struct drm_device *dev, u64 start, u32
> > size) {
> >     	drm_i915_private_t *dev_priv = dev->dev_private;
> >     	struct intel_ring_buffer *ring = &dev_priv->ring[RCS];
> > 
> >     	*ring = render_ring;
> >     	if (INTEL_INFO(dev)->gen >= 6) {
> 
> This branch executes only when hw generation is 6 or newer.

and adds gen6_render_ring_get_irq() to vtable which uses ring->irq_lock
which is left uninitialized.

I don't understand what you were trying to say. How does it matter if
some branch executes only for such-and-such hardware, when this branch
contains bugs? Could you please clarify?


> >     		ring->add_request = gen6_add_request;
> >     		ring->irq_get = gen6_render_ring_get_irq;
> >     		ring->irq_put = gen6_render_ring_put_irq;
> >     	} else if (IS_GEN5(dev)) {
> >     		ring->add_request = pc_render_add_request;
> >     		ring->get_seqno = pc_render_get_seqno;
> >     	}

  reply	other threads:[~2011-08-09 15:36 UTC|newest]

Thread overview: 84+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-20 17:06 Major 2.6.38 regression ignored? Luke-Jr
2011-05-20 18:08 ` Ray Lee
2011-05-20 20:24   ` Rafael J. Wysocki
2011-05-20 21:11     ` Ray Lee
2011-05-21  8:41   ` Chris Wilson
2011-05-21 15:23     ` Luke-Jr
2011-05-21 15:40       ` Chris Wilson
2011-05-21 15:40         ` Chris Wilson
2011-05-21 19:33         ` Luke-Jr
2011-05-21 19:33           ` Luke-Jr
2011-05-28 13:19         ` Major 2.6.38 / 2.6.39 " Kirill Smelkov
2011-07-12 17:17           ` [Intel-gfx] " Kirill Smelkov
2011-07-12 18:07             ` Pekka Enberg
2011-07-12 18:07               ` Pekka Enberg
2011-07-22  2:59               ` Linux 3.0 release Linus Torvalds
2011-07-22 11:08                 ` Major 2.6.38 / 2.6.39 / 3.0 regression ignored? Kirill Smelkov
2011-07-22 11:08                   ` Kirill Smelkov
2011-07-22 14:12                   ` Herbert Xu
2011-07-22 14:12                     ` Herbert Xu
2011-07-22 18:00                   ` Keith Packard
2011-07-22 18:00                     ` Keith Packard
2011-07-22 20:23                     ` Kirill Smelkov
2011-07-22 20:23                       ` Kirill Smelkov
2011-07-22 20:50                       ` Keith Packard
2011-07-22 20:50                         ` Keith Packard
2011-07-22 21:08                         ` Kirill Smelkov
2011-07-22 21:08                           ` Kirill Smelkov
2011-07-22 21:31                           ` [Intel-gfx] " Kirill Smelkov
2011-07-22 21:31                             ` Kirill Smelkov
2011-07-23 15:10                             ` [Intel-gfx] " Alex Deucher
2011-07-23 15:10                               ` Alex Deucher
2011-07-23 18:19                               ` Kirill Smelkov
2011-07-23 18:19                                 ` Kirill Smelkov
2011-07-23 15:55                         ` Pekka Enberg
2011-07-25  4:29                           ` Keith Packard
2011-07-26 13:48                       ` [Intel-gfx] " Kirill Smelkov
2011-07-26 13:48                         ` Kirill Smelkov
2011-08-09 12:08                         ` Kirill Smelkov
2011-08-09 12:08                           ` Kirill Smelkov
2011-08-09 14:00                           ` [Intel-gfx] " Vasily Khoruzhick
2011-08-09 14:00                             ` Vasily Khoruzhick
2011-08-09 14:47                             ` [Intel-gfx] " Kirill Smelkov
2011-08-09 14:47                               ` Kirill Smelkov
2011-08-09 15:09                               ` [Intel-gfx] " Vasily Khoruzhick
2011-08-09 15:09                                 ` Vasily Khoruzhick
2011-08-09 15:34                                 ` Kirill Smelkov [this message]
2011-08-09 15:34                                   ` Kirill Smelkov
2011-08-09 16:02                                   ` [Intel-gfx] " Vasily Khoruzhick
2011-08-09 16:02                                     ` Vasily Khoruzhick
2011-08-09 16:32                                     ` [Intel-gfx] " Kirill Smelkov
2011-08-09 16:32                                       ` Kirill Smelkov
2011-08-09 16:56                                       ` Ray Lee
2011-08-09 16:56                                         ` Ray Lee
2011-08-09 17:40                                         ` Kirill Smelkov
2011-08-09 17:40                                           ` Kirill Smelkov
2011-08-09 17:43                                           ` [Intel-gfx] " Ray Lee
2011-08-09 17:43                                             ` Ray Lee
2011-08-10  8:36                                             ` Kirill Smelkov
2011-08-10  8:36                                               ` Kirill Smelkov
2011-08-10  9:41                                           ` [Intel-gfx] " Alan Cox
2011-08-10  9:41                                             ` Alan Cox
2011-08-10 11:37                                             ` Kirill Smelkov
2011-08-10 11:37                                               ` Kirill Smelkov
2011-07-22 12:52                 ` Linux 3.0 release Martin Knoblauch
2011-07-22 19:11                 ` David
2011-07-22 19:21                   ` Linus Torvalds
2011-07-22 19:44                     ` Ben Greear
2011-07-22 20:32                       ` Stephen Hemminger
2011-07-22 20:35                         ` Linus Torvalds
2011-07-23  2:27                           ` Tejun Heo
2011-07-23  2:30                             ` Tejun Heo
2011-07-22 21:26                         ` Francois Romieu
2011-07-22 22:09                           ` Stephen Hemminger
2011-07-22 22:53                             ` [PATCH] net: allow netif_carrier to be called safely from IRQ Stephen Hemminger
2011-07-23  0:16                               ` David Miller
2011-07-22 23:21                 ` Linux 3.0 release - btrfs possible locking deadlock Ed Tomlinson
2011-07-25 19:49                   ` Chris Mason
2011-07-26  0:22                     ` Ed Tomlinson
2011-07-24 22:04                 ` Linux 3.0 release Arnaud Lacombe
2011-07-25  2:21                   ` Yoshinori Sato
2011-07-25 15:50                     ` Arnaud Lacombe
2011-07-27 15:22                       ` Yoshinori Sato
2011-07-27 17:29                         ` Arnaud Lacombe
2011-07-28  2:08                         ` Arnaud Lacombe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110809153446.GA22077@tugrik.mns.mnsspb.ru \
    --to=kirr@mns.spb.ru \
    --cc=akpm@linux-foundation.org \
    --cc=anarsoul@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=herbert@gondor.hengli.com.au \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=keithp@keithp.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luke@dashjr.org \
    --cc=penberg@kernel.org \
    --cc=ray-lk@madrabbit.org \
    --cc=rjw@sisk.pl \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.