From: "Zanoni, Paulo R"
To: "stefanr@s5r6.in-berlin.de"
CC: "airlied@redhat.com", "intel-gfx@lists.freedesktop.org", "linux-kernel@vger.kernel.org", "dri-devel@lists.freedesktop.org", "Vetter, Daniel"
Subject: Re: Regression of v4.6-rc vs. v4.5 bisected: a98ee79317b4 "drm/i915/fbc: enable FBC by default on HSW and BDW"
Date: Thu, 5 May 2016 23:55:56 +0000
Message-ID: <1462492552.14511.10.camel@intel.com>
References: <20160426210008.2f79fcdf@kant> <20160429100741.6be95385@kant> <20160430155154.597829ca@kant> <20160505194506.63b9c113@kant> <1462474211.29701.21.camel@intel.com> <20160506005457.1fa4b4e3@kant>
In-Reply-To: <20160506005457.1fa4b4e3@kant>

On Fri, 2016-05-06 at 00:54 +0200, Stefan Richter wrote:
> On May 05 Zanoni, Paulo R wrote:
> >
> > On Thu, 2016-05-05 at 19:45 +0200, Stefan Richter wrote:
> > >
> > >     Oh, and in case you - the person reading this commit message
> > >     - found this commit through git bisect, please do the following:
> > >      - Check your dmesg and see if there are error messages mentioning
> > >        underruns around the time your problem started happening.
> > >
> > > Well, I always had the following lines in dmesg:
> > > [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> > > [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
> >
> > Oh, well... I had a patch that would just disable FBC in case we saw a
> > FIFO underrun, but it was rejected. Maybe this is the time to think
> > about it again? Otherwise, I can't think of much besides disabling FBC
> > on HSW until all the underruns and watermarks regressions are fixed
> > forever.
>
> Just to be clear though, I know that these messages are emitted when the
> monitor is switched on, and when sddm is being shut down --- but I do not
> know whether there is any sort of underrun when I get the FBC related
> freeze (since I just don't get any kernel messages at that point).

The fact that underruns have occurred earlier is enough to know that
something is wrong (most probably, bad watermarks). We stop reporting
underruns once we get the first one, so the lack of messages at the time
of the freeze does not rule them out. In addition, we already know that
FBC tends to amplify apparently harmless FIFO underruns into black
screens, and I wouldn't be surprised to learn that it could also cause
full machine lockups.

> Is there a chance that a serial console would fare better than
> netconsole?  This board and another PC in its vicinity have got onboard
> serial ports but I don't have cables at the moment.

In the past, for some specific cases not related to FBC, I had more luck
with serial console than with netconsole. But if this is really caused
by FBC and watermarks, I don't think you'll be able to grab any specific
message at the time of the machine hang.
OTOH, if something actually shows up, it could help invalidate our
current assumption about the relationship between the problem, FBC,
and underruns.

> > >      - Download intel-gpu-tools, compile it, and run:
> > >        $ sudo ./tests/kms_frontbuffer_tracking --run-subtest '*fbc-*' 2>&1 | tee fbc.txt
> > >        Then send us the fbc.txt file, especially if you get a failure.
> > >        This will really maximize your chances of getting the bug fixed
> > >        quickly.
> > >
> > > Do you need this while FBC is enabled, or can I run it while FBC is
> > > disabled?
> >
> > FBC enabled. Considering your description, my hope is that maybe some
> > specific subtest will be able to hang your machine, so testing this
> > again will require only running the specific subtest instead of waiting
> > 18 hours.
>
> The kms_frontbuffer_tracking runs from which I posted output two hours
> ago did not trigger a lockup.
>
> (I ran them while X11 was shut down because otherwise
> kms_frontbuffer_tracking would skip all tests with "Can't become DRM
> master, please check if no other DRM client is running.")

Yes, this is the correct way.

> > > PS:
> > > I am mentioning the following just in case that it has any relationship
> > > with the FBC related kernel freezes.  Maybe it doesn't...  There is
> > > another recent regression on this PC, but I have not yet figured out
> > > whether it was introduced by any particular kernel version.  The
> > > regression is:  When switching from X11 to text console by
> > > [Ctrl][Alt][Fx] or by shutting down sddm, I often only get a blank
> > > screen.  I suspect that this regression was introduced when I replaced
> > > kdm by sddm, but I am not sure about that.
> >
> > Maybe there is some relationship, since this operation involves a mode
> > change. You can also try checking dmesg to see if there are underruns
> > right when you do the change.
> Yes, this is accompanied by
> [drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
> [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
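
For anyone who wants to check a saved dmesg log for the two underrun
messages quoted in this thread, here is a minimal Python sketch. It is
only an illustration: the function names (find_underruns, count_by_pipe)
and the SAMPLE text are made up for this example, the first SAMPLE line
is a placeholder rather than real kernel output, and the regular
expression assumes the messages look exactly like the ones quoted above.

```python
import re
from collections import Counter

# Matches the two message shapes quoted in this thread:
#   ... *ERROR* uncleared fifo underrun on pipe A
#   ... *ERROR* CPU pipe A FIFO underrun
# Assumption: pipes are named A, B or C.
UNDERRUN_RE = re.compile(
    r"fifo underrun on pipe (?P<a>[A-C])|pipe (?P<b>[A-C]) FIFO underrun",
    re.IGNORECASE,
)


def find_underruns(dmesg_text):
    """Return a list of (line_number, pipe, line) for each underrun message."""
    hits = []
    for number, line in enumerate(dmesg_text.splitlines(), start=1):
        match = UNDERRUN_RE.search(line)
        if match:
            pipe = (match.group("a") or match.group("b")).upper()
            hits.append((number, pipe, line.strip()))
    return hits


def count_by_pipe(hits):
    """Count underrun messages per pipe."""
    return Counter(pipe for _, pipe, _ in hits)


# Hypothetical sample input: one placeholder line plus the two messages
# quoted in this thread.
SAMPLE = """\
some unrelated kernel message
[drm:intel_set_cpu_fifo_underrun_reporting] *ERROR* uncleared fifo underrun on pipe A
[drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
"""

if __name__ == "__main__":
    for number, pipe, line in find_underruns(SAMPLE):
        print(f"line {number}: pipe {pipe}: {line}")
```

To use it on a real log, read the file saved from dmesg and pass its
contents to find_underruns(). Keep in mind that underrun reporting stops
after the first one, so an empty result for the period around a freeze
does not prove that no underrun happened then.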