From: Alex Deucher <alexdeucher@gmail.com>
To: Josh Boyer <jwboyer@gmail.com>
Cc: Dave Airlie <airlied@linux.ie>,
	Alex Deucher <alexander.deucher@amd.com>,
	Jerome Glisse <jglisse@redhat.com>,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	DRI mailing list <dri-devel@lists.freedesktop.org>
Subject: Re: [git pull] drm merge for 3.9-rc1
Date: Thu, 28 Feb 2013 10:09:59 -0500
Message-ID: <CADnq5_OadSJ1thVEvfQVfTZ3De0vivVjnCKwepVxabs6i6ivLQ@mail.gmail.com>
In-Reply-To: <CA+5PVA7Yzrx_=k9HB-D5v=7gsmn1ZyNb7YpzwDW-1j35Bbrs4Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3404 bytes --]

On Thu, Feb 28, 2013 at 8:44 AM, Josh Boyer <jwboyer@gmail.com> wrote:
> On Thu, Feb 28, 2013 at 8:38 AM, Alex Deucher <alexdeucher@gmail.com> wrote:
>> On Wed, Feb 27, 2013 at 8:14 PM, Josh Boyer <jwboyer@gmail.com> wrote:
>>> On Wed, Feb 27, 2013 at 7:01 PM, Josh Boyer <jwboyer@gmail.com> wrote:
>>>> On Wed, Feb 27, 2013 at 3:20 PM, Josh Boyer <jwboyer@gmail.com> wrote:
>>>>> On Wed, Feb 27, 2013 at 11:34 AM, Josh Boyer <jwboyer@gmail.com> wrote:
>>>>>> On Mon, Feb 25, 2013 at 7:05 PM, Dave Airlie <airlied@linux.ie> wrote:
>>>>>>> Alex Deucher (29):
>>>>>>>       drm/radeon: halt engines before disabling MC (6xx/7xx)
>>>>>>>       drm/radeon: halt engines before disabling MC (evergreen)
>>>>>>>       drm/radeon: halt engines before disabling MC (cayman/TN)
>>>>>>>       drm/radeon: halt engines before disabling MC (si)
>>>>>>>       drm/radeon: use the reset mask to determine if rings are hung
>>>>>>
>>>>>> Something in this series of commits is causing the GPU to hang on reboot
>>>>>> on my Dell XPS 8300 machine.  That has a:
>>>>>>
>>>>>> 01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee
>>>>>> ATI Caicos [Radeon HD 6450]
>>>>>>
>>>>>> card in it.  After reboots, I get a screen that looks like this:
>>>>>>
>>>>>> http://t.co/tPnT6xQZUK
>>>>>>
>>>>>> I can hit it fairly consistently after a few reboots, so I tried doing a
>>>>>> git bisect on the radeon driver and it came down to:
>>>>>>
>>>>>> ca57802e521de54341efc8a56f70571f79ffac72 is the first bad commit
>>>>>
>>>>> So I don't think that's actually the cause of the problem.  Or at least
>>>>> not that alone.  I reverted it on top of Linus' latest tree and I still
>>>>> get the lockups.
>>>>
>>>> Actually, git bisect does seem to have gotten it correct.  Once I
>>>> actually tested the revert of just that on top of Linus' tree (commit
>>>> d895cb1af1), things seem to be working much better.  I've rebooted a
>>>> dozen times without a lockup.  The most I've seen it take on a kernel
>>>> with that commit included is 3 reboots, so that's definitely at least an
>>>> improvement.
>>>
>>> I give up.  GPU issues are not my thing.  2 reboots after I sent that it
>>> gave me pretty rainbow static again.  So it might have been an
>>> improvement, but reverting it is not a solution.
>>>
>>> Looking at the rest of the commits, the whole GPU rework might be
>>> suspect, but I clearly have no clue.
>>
>> GPUs are tricky beasts :)
>
> Understatement ;).
>
>> ca57802e521de54341efc8a56f70571f79ffac72 most likely wasn't the
>> problem anyway since it only affects 6xx/7xx and your card is handled
>> by the evergreen code.  I'll put together some patches to help narrow
>> down the problem.
>
> Yeah, that's the biggest problem I have, not knowing which functions are
> actually being executed for this card.  It looks like a combination of
> stuff in evergreen.c and ni.c, but I have no idea.
>
> Patches would be great.  If nothing else, I'm really good at building
> kernels and rebooting by now.

Two possible fixes attached.  The first attempts a full reset of all
blocks if the MC (memory controller) is hung.  That may work better
than just resetting the MC.  The second just disables MC reset.  I'm
not sure we can reliably tell whether the MC is hung or merely busy,
since display requests hit it periodically.  If it is just busy, we
would be resetting it needlessly, possibly leading to failures like
the ones you are seeing.
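
For anyone following along without the driver source handy, here is a
rough standalone sketch of what the two patches do to the reset mask.
It is not driver code: the DEMO_* names, the bit positions, and the
demo_apply_mc_policy() helper are made up for illustration, and the
real RADEON_RESET_* values in radeon.h may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions only; the real RADEON_RESET_* values live in
 * the driver's radeon.h and are not reproduced here. */
#define DEMO_RESET_GFX (1u << 0)
#define DEMO_RESET_VMC (1u << 1)
#define DEMO_RESET_MC  (1u << 2)

/* Hypothetical selector for the two experimental behaviours. */
enum demo_mc_policy {
	DEMO_MC_FULL_RESET, /* first patch: escalate to resetting every block */
	DEMO_MC_SKIP_RESET, /* second patch: treat the MC as busy, not hung */
};

/* Post-process a reset mask the way each experimental patch does. */
static uint32_t demo_apply_mc_policy(uint32_t reset_mask,
				     enum demo_mc_policy policy)
{
	if (!(reset_mask & DEMO_RESET_MC))
		return reset_mask; /* MC not flagged: both patches are no-ops */

	switch (policy) {
	case DEMO_MC_FULL_RESET:
		return 0xffffffffu; /* request a reset of all blocks */
	case DEMO_MC_SKIP_RESET:
		return reset_mask & ~DEMO_RESET_MC; /* drop only the MC bit */
	}

	assert(0 && "unknown policy");
	return reset_mask;
}
```

With a mask of GFX plus MC, the first policy returns an all-ones mask
and the second returns GFX alone; a mask without the MC bit passes
through unchanged under either policy.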

Alex

[-- Attachment #2: 0001-drm-radeon-XXX-try-a-full-reset-if-the-MC-is-busy.patch --]
[-- Type: text/x-patch, Size: 992 bytes --]

From 9a648b04474ed230601c3c3e816cb281ebaad604 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 28 Feb 2013 09:56:48 -0500
Subject: [PATCH] drm/radeon: XXX try a full reset if the MC is busy

See if this helps.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/radeon/evergreen.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 3c38ea4..bbcac11 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2438,6 +2438,12 @@ static u32 evergreen_gpu_check_soft_reset(struct radeon_device *rdev)
 	if (tmp & L2_BUSY)
 		reset_mask |= RADEON_RESET_VMC;
 
+	/* reset everything if we attempt to reset the MC */
+	if (reset_mask & RADEON_RESET_MC) {
+		dev_info(rdev->dev, "MC busy: 0x%08X, resetting ALL\n", reset_mask);
+		reset_mask = 0xffffffff;
+	}
+
 	return reset_mask;
 }
 
-- 
1.7.7.5


[-- Attachment #3: 0001-drm-radeon-XXX-skip-MC-reset-as-it-s-probably-not-hu.patch --]
[-- Type: text/x-patch, Size: 1138 bytes --]

From 834c26ab02e3581ea97b39a90fc0637e7becfa67 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Thu, 28 Feb 2013 10:03:08 -0500
Subject: [PATCH] drm/radeon: XXX skip MC reset as it's probably not hung

The MC is most likely busy (e.g., display requests), not hung
so no need to reset it.  Doing an MC reset is tricky and not
particularly reliable.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/radeon/evergreen.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 3c38ea4..0f15ada 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -2438,6 +2438,12 @@ static u32 evergreen_gpu_check_soft_reset(struct radeon_device *rdev)
 	if (tmp & L2_BUSY)
 		reset_mask |= RADEON_RESET_VMC;
 
+	/* Skip MC reset as it's most likely not hung, just busy */
+	if (reset_mask & RADEON_RESET_MC) {
+		dev_info(rdev->dev, "MC busy: 0x%08X, clearing.\n", reset_mask);
+		reset_mask &= ~RADEON_RESET_MC;
+	}
+
 	return reset_mask;
 }
 
-- 
1.7.7.5

