From: bugzilla-daemon@bugzilla.kernel.org
To: dri-devel@lists.freedesktop.org
Subject: [Bug 198669] Driver crash at radeon_ring_backup+0xd3/0x140 [radeon]
Date: Wed, 07 Feb 2018 09:16:42 +0000
Message-ID: <bug-198669-2300-Q71CXy3HTH@https.bugzilla.kernel.org/>
In-Reply-To: <bug-198669-2300@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=198669

--- Comment #12 from roger@beardandsandals.co.uk (roger@beardandsandals.co.uk) ---
On 7 February 2018 08:23:06 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=198669
>
> --- Comment #10 from Christian König (christian.koenig@amd.com) ---
> (In reply to roger@beardandsandals.co.uk from comment #9)
>> The most likely cause of this kind of mechanical issue is the signal path
>> between the video interface hardware and the outside world, either a dry
>> joint or a mechanical fault in the cable or cable connectors.
>
> I absolutely agree with that.
>
>> The driver has sufficient
>> information to determine that a hard failure has occurred, and that failure
>> is probably not in the GPU itself. I would like to see the driver doing a
>> hard reset of the card with rigorous error checking. If it cannot reset the
>> GPU in graphical mode it should try to set the display hardware into a basic
>> console mode.
>
> And that is the part you don't seem to understand. The driver is
> trying exactly what you are describing.
>
> We detect a problem because of a timeout, i.e. the hardware doesn't
> respond within a given time frame to commands we send to it.
>
> What we do then is ask the hardware how far it got in the execution,
> and the hardware answered with a nonsense value. In other words, bits
> are set in the response which should never be set.
>
> This is a clear indicator that the PCIe transaction for the register read
> aborted because the device doesn't respond any more.
>
> The most likely cause of that is that the bus interface in the ASIC
> locked up because of an electrical problem (I think the ESD protection
> kicked in) and the only way to get out of that is a hard reset of the
> system.
>
> What we can do is try to prevent further failures like the crash you
> described by checking the values read from the hardware. This way you
> can at least access the box over the network or blindly shut it down
> with keyboard shortcuts.


Yes, I take your point. I was speculating on insufficient information. My 
apologies. The solution you propose sounds great.
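
For illustration, here is a minimal sketch in C of the kind of check
described above: validate a read pointer fetched from the hardware before
using it to size a ring backup. It is not taken from the radeon driver.
The names struct ring_state, READ_ABORTED, rptr_is_plausible and
ring_backup_count are hypothetical, and it assumes that the ring size is a
power of two and that an aborted PCIe read comes back as all ones, which
is common but is only an assumption about this particular failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring description; not the radeon driver's structure. */
struct ring_state {
    uint32_t size_dw;   /* ring size in dwords, assumed a power of two */
};

/* Assumption: an aborted PCIe register read returns all ones. */
#define READ_ABORTED 0xFFFFFFFFu

/*
 * Return true if a read pointer fetched from the hardware could be valid.
 * A valid pointer is an index into the ring, so any bit at or above the
 * ring size "should never be set".
 */
bool rptr_is_plausible(const struct ring_state *ring, uint32_t rptr)
{
    assert(ring->size_dw != 0 &&
           (ring->size_dw & (ring->size_dw - 1)) == 0);

    if (rptr == READ_ABORTED)
        return false;

    return (rptr & ~(ring->size_dw - 1)) == 0;
}

/*
 * Number of dwords between rptr and wptr that would need to be saved.
 * Returns 0 when rptr is implausible, so the caller skips the backup
 * instead of copying from a nonsense offset.
 */
uint32_t ring_backup_count(const struct ring_state *ring,
                           uint32_t rptr, uint32_t wptr)
{
    if (!rptr_is_plausible(ring, rptr))
        return 0;

    return (wptr - rptr) & (ring->size_dw - 1);
}
```

With a 1024-dword ring, a read pointer of 16 passes the check, while
0xFFFFFFFF or 4096 is rejected and the backup is skipped.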

Thank you for your patience.

