linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Rob Herring <robh+dt@kernel.org>
To: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>,
	Doug Anderson <dianders@chromium.org>,
	freedreno <freedreno@lists.freedesktop.org>,
	"open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS" 
	<devicetree@vger.kernel.org>, David Airlie <airlied@linux.ie>,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Rob Clark <robdclark@gmail.com>, Andy Gross <agross@kernel.org>,
	Daniel Vetter <daniel@ffwll.ch>, Sean Paul <sean@poorly.run>
Subject: Re: [Freedreno] [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting
Date: Wed, 15 Jul 2020 11:30:29 -0600	[thread overview]
Message-ID: <CAL_JsqKU7X5f-Dt67N_yFQUUVuXrZYomB-n=dKyUqD94OXFyJQ@mail.gmail.com> (raw)
In-Reply-To: <20200714225220.GI388985@builder.lan>

On Tue, Jul 14, 2020 at 4:54 PM Bjorn Andersson
<bjorn.andersson@linaro.org> wrote:
>
> On Tue 14 Jul 15:13 PDT 2020, Rob Herring wrote:
>
> > On Tue, Jul 14, 2020 at 10:33 AM Jeffrey Hugo <jeffrey.l.hugo@gmail.com> wrote:
> > >
> > > On Mon, Jul 13, 2020 at 5:50 PM Doug Anderson <dianders@chromium.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jul 13, 2020 at 1:25 PM Rob Herring <robh+dt@kernel.org> wrote:
> > > > >
> > > > > On Mon, Jul 13, 2020 at 9:08 AM Doug Anderson <dianders@chromium.org> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, Jul 13, 2020 at 7:11 AM Rob Herring <robh+dt@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 10, 2020 at 5:02 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > > > > >
> > > > > > > > I found that if I ever had a little mistake in my kernel config,
> > > > > > > > or device tree, or graphics driver that my system would sit in a loop
> > > > > > > > at bootup trying again and again and again.  An example log was:
> > > > > > >
> > > > > > > Why do we care about optimizing the error case?
> > > > > >
> > > > > > It actually results in a _fully_ infinite loop.  That is: if anything
> > > > > > small causes a component of DRM to fail to probe then the whole system
> > > > > > doesn't boot because it just loops trying to probe over and over
> > > > > > again.  The messages I put in the commit message are printed over and
> > > > > > over and over again.
> > > > >
> > > > > Sounds like a bug as that's not what should happen.
> > > > >
> > > > > If you defer during boot (initcalls), then you'll be on the deferred
> > > > > list until late_initcall and everything is retried. After
> > > > > late_initcall, only devices getting added should trigger probing. But
> > > > > maybe the adding and then removing a device is causing a re-trigger.
> > > >
> > > > Right, I'm nearly certain that the adding and then removing is causing
> > > > a re-trigger.  I believe the loop would happen for any case where we
> > > > have a probe function that:
> > > >
> > > > 1. Adds devices.
> > > > 2. After adding devices it decides that it needs to defer.
> > > > 3. Removes the devices it added.
> > > > 4. Return -EPROBE_DEFER from its probe function.
> > > >
> > > > Specifically from what I know about how -EPROBE_DEFER works I'm not
> > > > sure how it wouldn't cause an infinite loop in that case.
> > > >
> > > > Perhaps the missing part of my explanation, though, is why it never
> > > > gets out of this infinite loop.  In my case I purposely made the
> > > > bridge chip "ti-sn65dsi86.c" return an error (-EINVAL) in its probe
> > > > every time.  Obviously I wasn't going to get a display up like this,
> > > > but I just wanted to not loop forever at bootup.  I tracked down
> > > > exactly why we get an - EPROBE_DEFER over and over in this case.
> > > >
> > > > You can see it in msm_dsi_host_register().  If some components haven't
> > > > shown up when that function runs it will _always_ return
> > > > -EPROBE_DEFER.
> > > >
> > > > In my case, since I caused the bridge to fail to probe, those
> > > > components will _never_ show up.  That means that
> > > > msm_dsi_host_register() will _always_ return -EPROBE_DEFER.
> > > >
> > > > I haven't dug through all the DRM code enough, but it doesn't
> > > > necessarily seem like the wrong behavior.  If the bridge driver or a
> > > > panel was a module then (presumably) they could show up later and so
> > > > it should be OK for it to defer, right?
> > > >
> > > > So with all that, it doesn't really feel like this is a bug so much as
> > > > it's an unsupported use case.  The current deferral logic simply can't
> > > > handle the case we're throwing at it.  You cannot return -EPROBE_DEFER
> > > > if your probe function adds devices each time through the probe
> > > > function.
> > > >
> > > > Assuming all the above makes sense, that means we're stuck with:
> > > >
> > > > a) This patch series, which makes us not add devices.
> > > >
> > > > b) Some other patch series which rearchitects the MSM graphics stack
> > > > to not return -EPROBE_DEFER in this case.
> > >
> > > This isn't a MSM specific issue.  This is an issue with how the DSI
> > > interface works, and how software is structured in Linux.  I would
> > > expect that pretty much any DSI host in the kernel would have some
> > > version of this issue.
> > >
> > > The problem is that DSI is not "hot pluggable", so to give the DRM
> > > stack the info it needs, we need both the DSI controller (aka the MSM
> > > graphics stack in your case), and the thing it connects to (in your
> > > case, the TI bridge, normally the actual panel) because the DRM stack
> > > expects that if init completes, it has certain information
> > > (resolution, etc), and some of that information is in the DSI
> > > controller, and some of it is on the DSI device.
> >
> > Ah yes, DRM's lack of hot-plug and discrete component support... Is
> > that not improved with some of the bridge rework?
> >
> > Anyways, given there is a child dependency on the parent, I don't
> > think we should work-around DRM deficiencies in DT.
> >
> > BTW, There's also a deferred probe timeout you can use which stops
> > deferring probe some number of seconds after late_initcall.
> >
>
> I don't think we can rely on the deferred probe timeout, given that it
> was reverted back to 0 seconds past late_initcall - which given that
> most of the involved components are modules, means that without the
> opt-in command line option we would likely fail to bring up the display.

I meant just as a way to make progress booting in this case where the
display is never coming up. We're talking only about a better
experience for an error case.

Maybe a simple solution is just having some delay inserted between
delayed probe triggers so progress is made.

Rob

      reply	other threads:[~2020-07-15 17:30 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-10 23:02 [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting Douglas Anderson
2020-07-10 23:02 ` [PATCH 1/9] drm/msm: Use the devm variant of of_platform_populate() Douglas Anderson
2020-07-10 23:02 ` [PATCH 2/9] dt-bindings: msm/dpu: Add simple-bus to dpu bindings Douglas Anderson
2020-07-10 23:02 ` [PATCH 3/9] dt-bindings: msm/mdp5: " Douglas Anderson
2020-07-10 23:02 ` [PATCH 4/9] drm/msm: Avoid manually populating our children if "simple-bus" is there Douglas Anderson
2020-07-10 23:02 ` [PATCH 5/9] arm64: dts: qcom: sc7180: Add "simple-bus" to our mdss node Douglas Anderson
2020-07-10 23:02 ` [PATCH 6/9] arm64: dts: qcom: sdm845: " Douglas Anderson
2020-07-10 23:02 ` [PATCH 7/9] arm64: dts: qcom: msm8916: " Douglas Anderson
2020-07-10 23:02 ` [PATCH 8/9] arm64: dts: qcom: msm8996: " Douglas Anderson
2020-07-10 23:02 ` [PATCH 9/9] ARM: dts: qcom: msm8974: " Douglas Anderson
2020-07-13 14:11 ` [PATCH 0/9] drm/msm: Avoid possible infinite probe deferral and speed booting Rob Herring
2020-07-13 14:58   ` [Freedreno] " Jeffrey Hugo
2020-07-13 15:08   ` Doug Anderson
2020-07-13 20:25     ` Rob Herring
2020-07-13 21:32       ` Rob Clark
2020-07-13 23:50       ` Doug Anderson
2020-07-14 16:32         ` [Freedreno] " Jeffrey Hugo
2020-07-14 22:13           ` Rob Herring
2020-07-14 22:52             ` Bjorn Andersson
2020-07-15 17:30               ` Rob Herring [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAL_JsqKU7X5f-Dt67N_yFQUUVuXrZYomB-n=dKyUqD94OXFyJQ@mail.gmail.com' \
    --to=robh+dt@kernel.org \
    --cc=agross@kernel.org \
    --cc=airlied@linux.ie \
    --cc=bjorn.andersson@linaro.org \
    --cc=daniel@ffwll.ch \
    --cc=devicetree@vger.kernel.org \
    --cc=dianders@chromium.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=jeffrey.l.hugo@gmail.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robdclark@gmail.com \
    --cc=sean@poorly.run \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).