* Re: Chromium - Application-level nouveau blacklist
@ 2019-01-06  3:36 K. York
       [not found] ` <CABeNrKXkSyyETKH2gz_6-L+F1ptWHqCZ3eg=Cwjpu73RodFSuA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: K. York @ 2019-01-06  3:36 UTC
  To: imirkin-FrUbXkNCsVf2fBVCVOL8/A; +Cc: nouveau-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW

> Opinions welcome.

I have a few ideas on the best way to approach this.

 - First of all, obviously, fix the WebGL CTS problems. (with
--ignore-gpu-blacklist)
 - Fix all other crashing issues and request re-inclusion. (This is
comment #37.) Chrome release cycles are only 6 weeks, so not too bad
of a lead time.
 - Set up fuzz testing to discover new crashing and stability issues
before they impact userspace. This will also help with discovering the
crash issues that need to get fixed. Chromium will probably loan
ClusterFuzz resources to help with this.
 - Set up monkey testing with Chrome on Nouveau to discover the causes
of the black-rectangle bugs.
 - Set up as-rendered diff testing between major Nouveau versions vs
HEAD and require review of differences before a release can be made.
(see https://fifoci.dolphin-emu.org/about/ for prior art)
 - Set up diff testing with the proprietary NVIDIA drivers. (This may
cost a significant amount of money to do.)
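
The as-rendered diff testing item above could be sketched roughly as follows. This is a minimal illustration in Python, not how fifoci or any Nouveau tooling actually works: the frame representation (nested lists of RGB tuples) and the names `changed_pixels` and `needs_review` are assumptions made up for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), 0-255 per channel
Frame = List[List[Pixel]]      # rows of pixels


def changed_pixels(reference: Frame, candidate: Frame,
                   tolerance: int = 0) -> List[Tuple[int, int]]:
    """Return (x, y) positions where any channel differs by more than tolerance."""
    if len(reference) != len(candidate) or any(
        len(r) != len(c) for r, c in zip(reference, candidate)
    ):
        raise ValueError("frames must have identical dimensions")
    changed = []
    for y, (ref_row, cand_row) in enumerate(zip(reference, candidate)):
        for x, (ref_px, cand_px) in enumerate(zip(ref_row, cand_row)):
            if any(abs(a - b) > tolerance for a, b in zip(ref_px, cand_px)):
                changed.append((x, y))
    return changed


def needs_review(reference: Frame, candidate: Frame, tolerance: int = 0) -> bool:
    """A rendering difference between release and HEAD blocks until reviewed."""
    return bool(changed_pixels(reference, candidate, tolerance))


# Hypothetical 2x2 frames: one rendered by the last release, one by HEAD.
release = [[(0, 0, 0), (255, 255, 255)], [(10, 20, 30), (40, 50, 60)]]
head = [[(0, 0, 0), (255, 255, 255)], [(10, 20, 30), (0, 0, 0)]]
print(changed_pixels(release, head))
```

A real setup would capture frames from recorded workloads on actual hardware; the tolerance parameter is there because small per-channel differences may be acceptable and should not all demand review.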

All of this requires developer time and effort to do. You might need
to organize a call for volunteers from a wider audience than just the
nouveau mailing list.

Best of luck,
~Kane
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau

* Re: Chromium - Application-level nouveau blacklist
       [not found] ` <CABeNrKXkSyyETKH2gz_6-L+F1ptWHqCZ3eg=Cwjpu73RodFSuA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-01-06  4:05   ` Ilia Mirkin
  0 siblings, 0 replies; 7+ messages in thread
From: Ilia Mirkin @ 2019-01-06  4:05 UTC
  To: K. York; +Cc: nouveau

On Sat, Jan 5, 2019 at 10:36 PM K. York <kanepyork@gmail.com> wrote:
>
> > Opinions welcome.
>
> I have a few ideas on the best way to approach this.
>
>  - First of all, obviously, fix the WebGL CTS problems. (with
> --ignore-gpu-blacklist )
>  - Fix all other crashing issues and request re-inclusion. (This is
> comment #37.) Chrome versions are only 6 weeks, so not too bad of a
> lead time.
>  - Set up fuzz testing to discover new crashing and stability issues
> before they impact userspace. This will also help with discovering the
> crash issues that need to get fixed. Chromium will probably loan
> ClusterFuzz resources to help with this.
>  - Set up monkey testing with Chrome on Nouveau to discover the causes
> of the black-rectangle bugs.
>  - Set up as-rendered diff testing between major Nouveau versions vs
> HEAD and require review of differences before a release can be made.
> (see https://fifoci.dolphin-emu.org/about/ for prior art)
>  - Set up diff testing with the proprietary NVIDIA drivers. (This may
> cost a significant amount of money to do.)
>
> All of this requires developer time and effort to do. You might need
> to organize a call for volunteers from a wider audience than just the
> nouveau mailing list.

Thanks for your feedback. Following such steps would surely lead to a
much higher quality driver than what we have today. As you're aware,
what you're suggesting requires an IMMENSE investment of effort, and
holds the nouveau driver to a considerably higher standard than any of
the other drivers, including the NVIDIA proprietary driver. For a
driver that's backed by fewer resources than almost any of the
other driver efforts, I don't think that's reasonable.

Here's the thing -- the only thing that outright dies for me right now
is the max-texture-size thing + what feel like browser bugs. The WebGL
CTS is relatively new, and not well battle-tested, so achieving a 100%
pass rate will involve fixing a lot of their tests (in fact I've
already fixed some of them). And even if I come back with clean
results, that will still be 1 test result from a matrix of (GPU SKU,
Mesa version, Kernel version, Other Factors) combinations which
probably numbers in the millions.

I've glanced at the HN discussion about this situation
(https://news.ycombinator.com/item?id=18834715), and it does seem like
people are focusing on the wrong thing... the important bit isn't that
nouveau crashes and burns in some situations -- everyone already knew
that, including the users of nouveau who continue to use it
nonetheless. It's that if every piece of software feels free to ignore
a system integrator's or user's wishes, then the user now has to know
how to override that behaviour separately in every application. The
situation is that Distro X has decided that nouveau is the right thing
for its users. A user can disable that by uninstalling or otherwise
disabling nouveau if they wish. But now chrome comes along with its
own set of rules. What if every application starts doing that?

It should also be noted that outside of a few pathological cases, like
creating 2GB+ textures which never happens in practice, nouveau works
just fine for me. For other people, it dies at random intervals,
irrespective of whether they're using chrome or not. While this is a
non-ideal scenario, chrome shouldn't be in the business of worrying
about things like that. It just confuses the situation for everyone.

Cheers,

  -ilia

* Re: Chromium - Application-level nouveau blacklist
  2019-01-06  7:37 ` Jason Ekstrand
  2019-01-07  7:06   ` Tapani Pälli
@ 2019-01-08  6:56   ` Stéphane Marchesin
  1 sibling, 0 replies; 7+ messages in thread
From: Stéphane Marchesin @ 2019-01-08  6:56 UTC
  To: Jason Ekstrand; +Cc: nouveau, ML Mesa-dev

On Sat, Jan 5, 2019 at 11:37 PM Jason Ekstrand <jason@jlekstrand.net> wrote:
>
> On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin <imirkin@alum.mit.edu> wrote:
>>
>> It looks like as of Chromium 71, nouveau is completely blacklisted.
>
>
> That's rather unfortunate. :-(  The intel mesa drivers were also blacklisted for quite some time a while back.  I'm not really sure what we did to get blacklisted or what we did to get unblacklisted.
>

One major difference is that we have shipped Chromebooks with
intel-based GPUs for ~8 years, so we (collectively, intel and Chrome
OS folks) have fixed the long tail of Chrome bugs for Chrome OS +
Intel, and Linux benefited as a side effect.


>>
>> I don't really see a way back from this, since they don't cite any
>> easily reproducible issues, except that some people had some issues
>> with indeterminate hardware and indeterminate versions of mesa.
>>
>> In the bug that triggered this
>> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
>> I might have slightly lost my cool, they (at the end) suggested that
>> we try to make nouveau a first-class citizen with chromium. However I
>> will never be able to present concrete evidence that inconcrete issues
>> are resolved. I did run the WebGL CTS suite, but that resulted in some
>> hangs from the max-texture-size-equivalent test, and some
>> browser-level weirdness after some tests where later tests all fail
>> (due to what I have to assume is a browser bug). I don't think I
>> managed to properly track down the true reason why. I didn't want to
>> reach out to them with such results, as that's just further evidence
>> of nouveau not working perfectly.
>
>
> If you want concrete bugs to fix, I highly recommend OpenGL[ES] conformance tests, dEQP, and the WebGL CTS (which is mostly a re-hash of the OpenGL ES 3.0 CTS).  Google cares quite a bit about driver conformance and are much more likely to consider nouveau to be high-quality if those test suites are in good shape.  Years of experience dealing with Google says that dEQP results speak much louder than philosophical arguments about who should decide whether or not Chromium should accept the distro GL.  Fortunately for you, the well funded driver teams (Intel and AMD) have already done a lot of the painful work of getting a lot of the bugs and "bugs" out of core mesa and gallium.  What's left are likely real back-end driver bugs which may be affecting some user somewhere so they're worth fixing.

The cause of this blacklist is not (lack of) dEQP conformance, but
instead mostly automated crash reports. In other words, crashes in the
field where we have a backtrace but not necessarily a good repro case.
For someone building an application like Chrome, the multitude of
kernel+user space drivers+OS version+compositor combinations basically
makes each bug a very, very long investigation. I argued a long time
ago that we should try to get more communication going between Chrome
folks and Linux GPU driver folks to fix this, but quickly realized
that the task at hand is huge. You can only make a dent in it by being
very systematic about it. If someone wants to commit the time to do
that, I would be happy to help communication around these efforts.


>
>>
>> In the meanwhile, end users are losing accelerated WebGL which in
>> practice worked just fine (at least in my usage of it), and probably
>> some other functionality.
>>
>> One idea is to flip GL_VENDOR to some random string if chromium is
>> running. I don't like this idea, but I also don't have any great
>> alternatives. We can also just take this, as yet-another nail in the
>> nouveau coffin.
>
>
> You asked for opinions, so here you go. :-P  In my personal (and rather disinterested) opinion, I would recommend against such measures.  The last thing anyone needs is an arms race between nouveau and Chromium teams.  I think the better short-term thing to do would be to provide some documentation about WebGL and educate users about Chromium's --ignore-gpu-blacklist option.  This documentation could go on the mesa website or, likely more usefully, it could go in various distro wiki entries about nouveau and/or general nvidia issues.  In the long term, what's needed is improving nouveau quality and stability and re-building trust with the Chromium team.  I'm not trying to attack nouveau here but the fact is that trust has been lost due to an unfortunate history of mis-filed (against Chromium) bugs.  That trust doesn't get re-built by nuclear solutions.


Yes, I think the Chrome side is very simple here: because there isn't
time or means for in-depth investigation, if a driver crashes too
much, it gets blacklisted. The situation is not unique; the GPU
blacklist file is 1700 lines:
https://chromium.googlesource.com/chromium/src/gpu/+/master/config/software_rendering_list.json

Anyway, IMO if the biggest crashers can be fixed, I think we could
eventually make a case to reenable.

Stéphane

>
> --Jason
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

* Re: Chromium - Application-level nouveau blacklist
  2019-01-06  7:37 ` Jason Ekstrand
@ 2019-01-07  7:06   ` Tapani Pälli
  2019-01-08  6:56   ` Stéphane Marchesin
  1 sibling, 0 replies; 7+ messages in thread
From: Tapani Pälli @ 2019-01-07  7:06 UTC
  To: Jason Ekstrand, Ilia Mirkin; +Cc: nouveau, ML Mesa-dev



On 1/6/19 9:37 AM, Jason Ekstrand wrote:
> On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin <imirkin@alum.mit.edu 
> <mailto:imirkin@alum.mit.edu>> wrote:
> 
>     It looks like as of Chromium 71, nouveau is completely blacklisted.
> 
> 
> That's rather unfortunate. :-(  The intel mesa drivers were also 
> blacklisted for quite some time a while back.  I'm not really sure what 
> we did to get blacklisted or what we did to get unblacklisted.

We had lots of GPU hangs from WebGL tests. We fixed things until at some
point the tests were passing, and our web team sent a patch to Chromium to
enable it again. This is probably the best route to get Nouveau
enabled as well.

I should note that we currently have some WebGL issues on i965 too;
we should take a look at some point.


>     I don't really see a way back from this, since they don't cite any
>     easily reproducible issues, except that some people had some issues
>     with indeterminate hardware and indeterminate versions of mesa.
> 
>     In the bug that triggered this
>     (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
>     I might have slightly lost my cool, they (at the end) suggested that
>     we try to make nouveau a first-class citizen with chromium. However I
>     will never be able to present concrete evidence that inconcrete issues
>     are resolved. I did run the WebGL CTS suite, but that resulted in some
>     hangs from the max-texture-size-equivalent test, and some
>     browser-level weirdness after some tests where later tests all fail
>     (due to what I have to assume is a browser bug). I don't think I
>     managed to properly track down the true reason why. I didn't want to
>     reach out to them with such results, as that's just further evidence
>     of nouveau not working perfectly.
> 
> 
> If you want concrete bugs to fix, I highly recommend OpenGL[ES] 
> conformance tests, dEQP, and the WebGL CTS (which is mostly a re-hash of 
> the OpenGL ES 3.0 CTS).  Google cares quite a bit about driver 
> conformance and are much more likely to consider nouveau to be 
> high-quality if those test suites are in good shape.  Years of 
> experience dealing with Google says that dEQP results speak much louder 
> than philosophical arguments about who should decide whether or not 
> Chromium should accept the distro GL.  Fortunately for you, the well 
> funded driver teams (Intel and AMD) have already done a lot of the 
> painful work of getting a lot of the bugs and "bugs" out of core mesa 
> and gallium.  What's left are likely real back-end driver bugs which may 
> be affecting some user somewhere so they're worth fixing.
> 
>     In the meanwhile, end users are losing accelerated WebGL which in
>     practice worked just fine (at least in my usage of it), and probably
>     some other functionality.
> 
>     One idea is to flip GL_VENDOR to some random string if chromium is
>     running. I don't like this idea, but I also don't have any great
>     alternatives. We can also just take this, as yet-another nail in the
>     nouveau coffin.
> 
> 
> You asked for opinions, so here you go. :-P  In my personal (and rather 
> disinterested) opinion, I would recommend against such measures.  The 
> last thing anyone needs is an arms race between nouveau and Chromium 
> teams.  I think the better short-term thing to do would be to provide 
> some documentation about WebGL and educate users about Chromium's 
> --ignore-gpu-blacklist option.  This documentation could go on the mesa 
> website or, likely more usefully, it could go in various distro wiki 
> entries about nouveau and/or general nvidia issues.  In the long term, 
> what's needed is improving nouveau quality and stability and re-building 
> trust with the Chromium team.  I'm not trying to attack nouveau here but 
> the fact is that trust has been lost due to an unfortunate history of 
> mis-filed (against Chromium) bugs.  That trust doesn't get re-built by 
> nuclear solutions.
> 
> --Jason

* Re: Chromium - Application-level nouveau blacklist
  2019-01-05 20:40 Ilia Mirkin
  2019-01-06  7:37 ` Jason Ekstrand
@ 2019-01-06 13:52 ` Rob Clark
  1 sibling, 0 replies; 7+ messages in thread
From: Rob Clark @ 2019-01-06 13:52 UTC
  To: Ilia Mirkin; +Cc: nouveau, ML Mesa-dev

On Sat, Jan 5, 2019 at 3:40 PM Ilia Mirkin <imirkin@alum.mit.edu> wrote:
>
> It looks like as of Chromium 71, nouveau is completely blacklisted.
>
> I don't really see a way back from this, since they don't cite any
> easily reproducible issues, except that some people had some issues
> with indeterminate hardware and indeterminate versions of mesa.
>
> In the bug that triggered this
> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
> I might have slightly lost my cool, they (at the end) suggested that
> we try to make nouveau a first-class citizen with chromium. However I
> will never be able to present concrete evidence that inconcrete issues
> are resolved. I did run the WebGL CTS suite, but that resulted in some
> hangs from the max-texture-size-equivalent test, and some
> browser-level weirdness after some tests where later tests all fail
> (due to what I have to assume is a browser bug). I don't think I
> managed to properly track down the true reason why. I didn't want to
> reach out to them with such results, as that's just further evidence
> of nouveau not working perfectly.
>
> In the meanwhile, end users are losing accelerated WebGL which in
> practice worked just fine (at least in my usage of it), and probably
> some other functionality.
>
> One idea is to flip GL_VENDOR to some random string if chromium is
> running. I don't like this idea, but I also don't have any great
> alternatives. We can also just take this, as yet-another nail in the
> nouveau coffin.
>

I think this would be a really bad idea.

A better idea might be to ask chromium to whitelist nouveau for
pairs of nv generation + mesa version that are known to pass (or at
least come reasonably close to passing?) WebGL CTS.  Maybe set up a
wiki page or trello or bz or whatever w/ some pointers to info about
how to disable the gpu blacklist (to run the cts tests in the first
place) and how to run cts, and a table of nv generations.  I guess you
don't have hw or time to test everything yourself, but this is something
that distros and users can help with.  The idea for
wiki/trello/whatever is to help coordinate that and track open bugs
for failing CTS tests.
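
The tracking table described above could be sketched like this. It is only an illustration in Python: the `CtsResult` type, the `whitelist_candidates` helper, and every generation name, version label, count and bug id in the sample data are hypothetical placeholders, not real test results.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CtsResult:
    """WebGL CTS outcome reported for one (GPU generation, Mesa version) pair."""
    passed: int
    failed: int
    open_bugs: List[str] = field(default_factory=list)

    @property
    def pass_rate(self) -> float:
        total = self.passed + self.failed
        return self.passed / total if total else 0.0


# Keyed by (nv generation, mesa version), as in the proposed table.
Results = Dict[Tuple[str, str], CtsResult]


def whitelist_candidates(results: Results,
                         min_pass_rate: float = 1.0) -> List[Tuple[str, str]]:
    """Pairs that pass, or come close enough to passing, to propose upstream."""
    return sorted(key for key, r in results.items()
                  if r.pass_rate >= min_pass_rate)


# Placeholder data only; real rows would come from distro and user reports.
reports: Results = {
    ("gen-A", "mesa-X"): CtsResult(passed=100, failed=0),
    ("gen-A", "mesa-Y"): CtsResult(passed=95, failed=5, open_bugs=["bug-1"]),
    ("gen-B", "mesa-X"): CtsResult(passed=50, failed=50, open_bugs=["bug-2", "bug-3"]),
}
print(whitelist_candidates(reports))
print(whitelist_candidates(reports, min_pass_rate=0.9))
```

The `open_bugs` field is what lets the same table double as a tracker for failing CTS tests, whichever of wiki, trello or bz ends up hosting it.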


BR,
-R

* Re: Chromium - Application-level nouveau blacklist
  2019-01-05 20:40 Ilia Mirkin
@ 2019-01-06  7:37 ` Jason Ekstrand
  2019-01-07  7:06   ` Tapani Pälli
  2019-01-08  6:56   ` Stéphane Marchesin
  2019-01-06 13:52 ` Rob Clark
  1 sibling, 2 replies; 7+ messages in thread
From: Jason Ekstrand @ 2019-01-06  7:37 UTC
  To: Ilia Mirkin; +Cc: nouveau, ML Mesa-dev


On Sat, Jan 5, 2019 at 2:40 PM Ilia Mirkin <imirkin@alum.mit.edu> wrote:

> It looks like as of Chromium 71, nouveau is completely blacklisted.
>

That's rather unfortunate. :-(  The intel mesa drivers were also
blacklisted for quite some time a while back.  I'm not really sure what we
did to get blacklisted or what we did to get unblacklisted.


> I don't really see a way back from this, since they don't cite any
> easily reproducible issues, except that some people had some issues
> with indeterminate hardware and indeterminate versions of mesa.
>
> In the bug that triggered this
> (https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
> I might have slightly lost my cool, they (at the end) suggested that
> we try to make nouveau a first-class citizen with chromium. However I
> will never be able to present concrete evidence that inconcrete issues
> are resolved. I did run the WebGL CTS suite, but that resulted in some
> hangs from the max-texture-size-equivalent test, and some
> browser-level weirdness after some tests where later tests all fail
> (due to what I have to assume is a browser bug). I don't think I
> managed to properly track down the true reason why. I didn't want to
> reach out to them with such results, as that's just further evidence
> of nouveau not working perfectly.
>

If you want concrete bugs to fix, I highly recommend OpenGL[ES] conformance
tests, dEQP, and the WebGL CTS (which is mostly a re-hash of the OpenGL ES
3.0 CTS).  Google cares quite a bit about driver conformance and are much
more likely to consider nouveau to be high-quality if those test suites are
in good shape.  Years of experience dealing with Google says that dEQP
results speak much louder than philosophical arguments about who should
decide whether or not Chromium should accept the distro GL.  Fortunately
for you, the well funded driver teams (Intel and AMD) have already done a
lot of the painful work of getting a lot of the bugs and "bugs" out of core
mesa and gallium.  What's left are likely real back-end driver bugs which
may be affecting some user somewhere so they're worth fixing.


> In the meanwhile, end users are losing accelerated WebGL which in
> practice worked just fine (at least in my usage of it), and probably
> some other functionality.
>
> One idea is to flip GL_VENDOR to some random string if chromium is
> running. I don't like this idea, but I also don't have any great
> alternatives. We can also just take this, as yet-another nail in the
> nouveau coffin.
>

You asked for opinions, so here you go. :-P  In my personal (and rather
disinterested) opinion, I would recommend against such measures.  The last
thing anyone needs is an arms race between nouveau and Chromium teams.  I
think the better short-term thing to do would be to provide some
documentation about WebGL and educate users about Chromium's
--ignore-gpu-blacklist option.  This documentation could go on the mesa
website or, likely more usefully, it could go in various distro wiki
entries about nouveau and/or general nvidia issues.  In the long term,
what's needed is improving nouveau quality and stability and re-building
trust with the Chromium team.  I'm not trying to attack nouveau here but
the fact is that trust has been lost due to an unfortunate history of
mis-filed (against Chromium) bugs.  That trust doesn't get re-built by
nuclear solutions.
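
The kind of snippet such documentation might carry could look like the sketch below. It only assembles a command line with the --ignore-gpu-blacklist option named in this thread; the executable name `chromium` and the helper names `build_command` and `launch` are assumptions for illustration, and other distros or Chromium versions may use a different binary or flag name.

```python
import shutil
import subprocess
from typing import List, Optional, Sequence

# Flag as named in this thread; check your Chromium version's documentation.
IGNORE_BLACKLIST_FLAG = "--ignore-gpu-blacklist"


def build_command(browser: str = "chromium",
                  extra_args: Optional[Sequence[str]] = None) -> List[str]:
    """Assemble the argv for starting the browser with the GPU blacklist ignored."""
    return [browser, IGNORE_BLACKLIST_FLAG, *(extra_args or [])]


def launch(browser: str = "chromium") -> subprocess.Popen:
    """Start the browser if it is on PATH; 'chromium' is an assumed binary name."""
    path = shutil.which(browser)
    if path is None:
        raise FileNotFoundError(f"{browser!r} not found on PATH")
    return subprocess.Popen(build_command(path))


print(" ".join(build_command()))
```

After starting the browser this way, the chrome://gpu page reports whether hardware acceleration is actually in use.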

--Jason

* Chromium - Application-level nouveau blacklist
@ 2019-01-05 20:40 Ilia Mirkin
  2019-01-06  7:37 ` Jason Ekstrand
  2019-01-06 13:52 ` Rob Clark
  0 siblings, 2 replies; 7+ messages in thread
From: Ilia Mirkin @ 2019-01-05 20:40 UTC
  To: nouveau; +Cc: ML Mesa-dev

It looks like as of Chromium 71, nouveau is completely blacklisted.

I don't really see a way back from this, since they don't cite any
easily reproducible issues, except that some people had some issues
with indeterminate hardware and indeterminate versions of mesa.

In the bug that triggered this
(https://bugs.chromium.org/p/chromium/issues/detail?id=876523), where
I might have slightly lost my cool, they (at the end) suggested that
we try to make nouveau a first-class citizen with chromium. However I
will never be able to present concrete evidence that inconcrete issues
are resolved. I did run the WebGL CTS suite, but that resulted in some
hangs from the max-texture-size-equivalent test, and some
browser-level weirdness after some tests where later tests all fail
(due to what I have to assume is a browser bug). I don't think I
managed to properly track down the true reason why. I didn't want to
reach out to them with such results, as that's just further evidence
of nouveau not working perfectly.

In the meanwhile, end users are losing accelerated WebGL which in
practice worked just fine (at least in my usage of it), and probably
some other functionality.

One idea is to flip GL_VENDOR to some random string if chromium is
running. I don't like this idea, but I also don't have any great
alternatives. We can also just take this, as yet-another nail in the
nouveau coffin.

Opinions welcome.

Cheers,

  -ilia
