All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dmitry Osipenko <digetx@gmail.com>
To: Thierry Reding <thierry.reding@gmail.com>
Cc: Mikko Perttunen <cyndis@kapsi.fi>, Kyle Evans <kvans32@gmail.com>,
	linux-tegra@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] gpu: host1x: Ignore clients initialization failure
Date: Thu, 09 Aug 2018 15:53:03 +0300	[thread overview]
Message-ID: <1672653.y032a1rHIh@dimapc> (raw)
In-Reply-To: <20180809104541.GC21639@ulmo>

On Thursday, 9 August 2018 13:45:41 MSK Thierry Reding wrote:
> On Fri, Aug 03, 2018 at 02:30:47PM +0300, Dmitry Osipenko wrote:
> > On Friday, 3 August 2018 13:50:55 MSK Thierry Reding wrote:
> > > On Wed, Aug 01, 2018 at 06:08:07PM +0300, Dmitry Osipenko wrote:
> > > > From time to time new bugs are popping up, causing some host1x client
> > > > to
> > > > fail its initialization. Currently a single clients initialization
> > > > failure
> > > > causes whole host1x device registration to fail, as a result a single
> > > > DRM
> > > > sub-device initialization failure makes whole DRM initialization to
> > > > fail.
> > > > Let's ignore clients initialization failure, as a result display panel
> > > > lights up even if some DRM clients (say GR2D or VIC) fail to
> > > > initialize.
> > > > 
> > > > Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> > > > ---
> > > > 
> > > >  drivers/gpu/host1x/bus.c | 18 +++++++-----------
> > > >  1 file changed, 7 insertions(+), 11 deletions(-)
> > > 
> > > This is actually done on purpose. I can't think of a case where we would
> > > actively like to keep a half-broken DRM device operational. The errors
> > > that you're talking about should only happen during development,
> > 
> > [only in a perfect world]
> 
> gr2d and VIC are fairly deterministic. What are the errors you are
> seeing that cause initialization failure?

Right now there are no specific errors. There is only a known trouble with the 
ARM_DMA_USE_IOMMU, but that causes to fail all of the clients.

> Note that it is possible for
> these devices to malfunction even if initialization succeeds. A failure
> to initialize can only happen if there's something wrong in the device
> tree, firmware is missing (in case of VIC) or a regression was
> introduced in some subsystem that causes a failure (maybe deferred probe
> or something similar).

A missing firmware is an actual good example. Though can't VIC driver be 
changed to load firmware at the time of a its first use by userspace? It 
should be very annoying that kernel driver forces you to churn with initramfs.

> > > and the
> > > device not showing up is a pretty good indicator that something is wrong
> > > as opposed to everything booting normally and then getting some cryptic
> > > error at runtime because one of the clients didn't initialize.
> > 
> > If machine doesn't have a serial port and it doesn't have ssh server
> > running or network doesn't come up, you'll end up with a completely
> > dysfunctional piece of hardware. Hence there is no chance for you to even
> > check what is wrong. The argument about a cryptic error doesn't make much
> > sense, you have to improve your runtime as well (and you'll get a error
> > message in the kernels log).
> 
> You make a good point here. I'm not aware of any devices we support in
> the kernel that don't have a serial console, but that doesn't mean the
> kernel could be used on an "unsupported" device  that doesn't have one.

AFAIK, enabling serial on AC100 require soldering iron.

> > > From my perspective, all clients should always be operational in
> > > whatever baseline version you use. If it isn't that's a bug that should
> > > be fixed. Ideally those bugs should get fixed before making it into a
> > > baseline version (mainline, linux-next, ...), so that this never impacts
> > > even developers, unless they break it themselves, in which case refusing
> > > to register the DRM device is a pretty good incentive to fix it.
> > 
> > Sounds like you're assuming that only kernel developers are supposed to
> > use
> > Tegra, though at least (for now) for upstream it is kinda true. Of course
> > that could be changed ;-)
> 
> That's not at all what I'm saying. What I am saying is that we should
> make it difficult for developers to break the kernel in a way that would
> put users into a position that is difficult to get themselves out of. If
> we refuse to register the complete DRM device in case some subdevice
> fails to initialize we increase the chances of developers noticing and
> fixing it.
> 
> If all we do on failure is output an error message, it's very likely to
> go unnoticed. Developers are likely to check specifically the
> functionality that they're working on and ignore failures that they may
> have caused in other parts of the code as a side-effect of their current
> work.

I can try to help with improving of your automated testing suite, if you'll 
make it accessible to the public.

> Perhaps a good middle ground would be to turn initialization failures
> into WARN_ON() to increase the chances of them getting notified and then
> continue on a best effort basis in the hopes of having enough
> initialized to get a message to users.

Good to me, I'll make v2.

> Another alternative would be to
> mark essential subdevices (such as the display controller) so that only
> they will cause failure to register the whole DRM device.

I don't think that making some subdevice more valuable than the others is a 
feasible approach, how you are going to differentiate what of the two display 
controllers is more important than the other.

  reply	other threads:[~2018-08-09 12:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-01 15:08 [PATCH v1] gpu: host1x: Ignore clients initialization failure Dmitry Osipenko
2018-08-01 15:08 ` Dmitry Osipenko
2018-08-03 10:50 ` Thierry Reding
2018-08-03 10:50   ` Thierry Reding
2018-08-03 11:30   ` Dmitry Osipenko
2018-08-09 10:45     ` Thierry Reding
2018-08-09 12:53       ` Dmitry Osipenko [this message]
2018-08-09 13:10         ` Thierry Reding
2018-08-09 13:46           ` Dmitry Osipenko
2018-08-09 14:15             ` Thierry Reding
2018-08-09 14:15               ` Thierry Reding

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1672653.y032a1rHIh@dimapc \
    --to=digetx@gmail.com \
    --cc=cyndis@kapsi.fi \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=kvans32@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=thierry.reding@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.