From mboxrd@z Thu Jan 1 00:00:00 1970 From: Daniel Vetter Subject: Re: [PATCH 00/81] modeset rework Date: Sun, 15 Jul 2012 16:57:33 +0200 Message-ID: <20120715145733.GB5184@phenom.ffwll.local> References: <1342016944-23395-1-git-send-email-daniel.vetter@ffwll.ch> <1342358035_1196@CP5-2952> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail-wi0-f177.google.com (mail-wi0-f177.google.com [209.85.212.177]) by gabe.freedesktop.org (Postfix) with ESMTP id 048F99E752 for ; Sun, 15 Jul 2012 07:57:29 -0700 (PDT) Received: by wibhm11 with SMTP id hm11so1530477wib.12 for ; Sun, 15 Jul 2012 07:57:29 -0700 (PDT) Content-Disposition: inline In-Reply-To: <1342358035_1196@CP5-2952> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+gcfxdi-intel-gfx=m.gmane.org@lists.freedesktop.org To: Chris Wilson Cc: Daniel Vetter , Intel Graphics Development List-Id: intel-gfx@lists.freedesktop.org On Sun, Jul 15, 2012 at 02:13:41PM +0100, Chris Wilson wrote: > On Wed, 11 Jul 2012 16:27:43 +0200, Daniel Vetter wrote: > > Or: How to fix cpu edp in 81 simple steps. Admittedly this includes some minor > > detours. > > Let me quickly jump in with my demands: > > That modeset should detect failure and propagate that back to the user. > > The modesetting code has outgrown its origins of setting a couple of > registers and be done, as we are now in the domain of fiddly link > training and end-point negotations. So it is fragile, can conceivably > fail and so the errors need to be reported rather than blindly assume > the output is now working. > > So all of these nice new WARNs I'm seeing need to return error codes as > well. Tbh failure propagation hasn't been front-and-center when I've done this rework, but I've certainly noticed that we currenlty suck at this. Hence nothing too coherent, but just a few thoughts: - I think the WARNs should stay as-is. With the new code I only call them once the modeset has completely, or at least once our code thinks it has completed. Imo it's a really good tool to catch state-tracking goof-ups or modeset sequence ordering issues, not more. I.e. all the WARNs I've hit cleared indicated an issue in the code itself, and usually where the harbingers of this pesky permanent-black-screen-till-reboot issue. - The current error code handling inherited from the crtc helpers is rather lacking, i.e. I wondered whether I should smash a quick patch on top of this series to make set_base interruptible. And then noticed that we only return a bool in set_base (and many other such places). So even improving this for some simple&know cases will be quite some work. - I think we should also differentiate between the different reasons why a modeset can fail: From modes that the output doesn't support, failure in pinning the fb because the gpu died, global resource allocation issues (e.g. lack of pll, fdi links, ...) to the hw actually doing something wrong. Current set_mode ioctl is restricted to -ERRNO, but I wonder whether we should add something more flexible for the global modeset code (similar to how we already have tons of reasons to reject a mode in the ->mode_valid functions). - For global resource allocation and reservation (get_pch, pin_and_fence ...) I think we should move this kind of stuff out of the crtc/encoder callbacks into a prepare step (maybe merged together with the ->mode_fixup step). That way the actual modeset code should only fail if the hw is in a non-cooperative mood. This is also a necessary step to implement Jesse's idea of a modeset configuration checker. If we'd add some drm modeset specific error codes the mode configuration selection tool could even show nice tooltips on greyed-out-modes, explaning why the mode doesn't work (e.g. "not enough memory bandwidth", "not enough plls", or some generic "not enough resources" error ...). Yeah, I can have dreams ... - After all that, we'd be left with with the true hw issues like link training failures or stuck pipes. Judging by bug reports and own experience, once we're in such a state, we're stuck in such a state and re-doing the modeset doesn't help too much. We already try to restore the old mode, but if that also fails I'm not too sure what userspace should do with the resulting -EIO. At least for the short-term I think that our modeset code will profits the most if we exploit the simplified state machine of the new code and cut down on the statespace we're handling in the code. Like this patch series already does for DP. Also, all these WARNs should help in improving our modeset experience indirectly, through more bug reports and their fixes ;-) Long term I agree that we need robuster error handling, but I think handling of hw error should be about the last item: Ime we tend to keep on flailing once we start, and I expect the pay-off in code quality and robustness of the previous items to be better. In any case, this is all stuff that should sit on top of this modeset rework. Cc'ing Ville, maybe he has some further ideas from his global modeset work. Cheers, Daniel -- Daniel Vetter Mail: daniel@ffwll.ch Mobile: +41 (0)79 365 57 48