All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC: Drm-connector properties managed by another driver / privacy screen support
@ 2020-04-15  9:42 Hans de Goede
  2020-04-15  9:52 ` Daniel Vetter
  2020-04-15 15:28 ` Jani Nikula
  0 siblings, 2 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-15  9:42 UTC (permalink / raw)
  To: Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Daniel Vetter
  Cc: Benjamin Berg, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Mark Pearson, Rajat Jain, Nitin Joshi1

Hi All,

Somewhat long mail, so I've divided it into "chapters", sorry.

1. Back ground info
-------------------

First some quick background, some recent Lenovo models have
a builtin privacy screen which can be turned on/off in software
and the kernel recently has gotten support for this:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=110ea1d833ad291272d52e0a40a06157a3d9ba17

We have been looking into adding support for this to GNOME,
but the userspace API added by the above commit is very
Thinkpad specific, and we would rather not rely on an
userspace API which is specific to 1 series of laptops.

When we started discussing this I had already seen some versions
of Rajat's "drm/i915 Support for integrated privacy screen" series:

https://patchwork.freedesktop.org/series/74650/

Which adds support for integrated privacy screens
as a drm-connector property. Anyone who has been involved
in the backlight brightness control saga we have with
the sysfs backlight class instantly knows/feels that
this is the right way to handle this.

So now we want to export the Thinkpad lcdshadow
attribute as a drm-connector property.


2. Problem + Possible solutions
-------------------------------

The problem is that the code to get/set the lcdshadow
setting and also the code to listen for firmware (hardcoded
hotkeys) triggered state changes all lives inside the thinkpad_acpi
driver; and to export the lcdshadow setting as a drm property
we need to access that code (and it is too big to just copy
over).

One thing which makes this trickier is that all properties must
be attached to the connector before it is registered, we cannot
add it at a later time.

I see 3 possible solutions here:

i. Export some symbols from thinkpad_acpi and directly call these
from drm_connector_create_standard_properties and other
drm_connector functions if the thinkpad_acpi module is enabled.
Note this should be done in the core drm_connector functions since
the GPU might be one of i915 / amdgpu / nouveau. I believe that
it is clear that this solution is not very elegant.

A variant of this would be to have a separate helper module
(probaly a few static inlines in a .h) which exports some hooks for
i915 / amdgpu / nouveau to call this way we at least keep the
ugliness out of the core and keep the module-dep on thinkpad_acpi
limited to the i915 / amdgpu / nouveau modules. This might
actually not be too bad, except that currently the thinkpad_acpi
module refuses to load on non thinkpads...


ii. Currently the "privacy-screen" property added by Rajat's
patch-set is an enum with 2 possible values:
"Enabled"
"Disabled"

We could add a third value "Not Available", which would be the
default and then for internal panels always add the property
so that we avoid the problem that detecting if the laptop has
an internal privacy screen needs to be done before the connector
is registered. Then we can add some hooks which allow an
lcdshadow-driver to register itself against a connector later
(which is non trivial wrt probe order, but lets ignore that for now).


iii. We can add a generic framework to allow drivers outside
of the drm-subsys to register something drm_property(ish) specifying
a dev_name() and connector-name to which to attach the property
when that connector gets created on that device.

This requires the driver registering this property to be fully
loaded before the connector gets registered.


3. Picking a solution
---------------------

i. is just ugly, full stop, but also very KISS which still makes
it somewhat appealing. Also i. does not scale if we get other
vendor specific interfaces for interacting with these
privacy screens.


ii. is ugly from an userspace API pov, if there is no
"privacy-screen" then we really should not have the property at
all rather then setting it to "Not Available". Still it might be
an acceptable compromise I guess


iii. is/was the one I liked, until I started looking at the
drm_connector code and saw that all properties must be attached
before registering the connector, bummer. Then this morning I
remembered that the i915 driver has a similar issue with the
gvt stuff / the kvmgt module. The kvmgt module must be loaded
and fully initialized before the i915 driver loads. This has
been solved with module softdeps.

I think that we can do the same here having either the
i915+nouveau+amdgpu drivers; or the drm-core have a softdep on
thinkpad_acpi so that it can register the lcdshadow property
with the to-be-written framework for externally managed props
before the internal panels drm-connector gets created.

This also allows the thinkpad_acpi module_init function to
return -ENODEV on non Thinkpad devices, so that it will
not take up memory, where as with i. we would always need to
have it in memory.

I'm currently leaning towards iii. combined with using
MODULE_SOFTDEP("pre: thinkpad_acpi") to make sure the thinkpad_acpi
driver can do its thing before the internal display connector gets
created.

Daniel (and others) does this (iii. + softdeps) sound like
something which might be acceptable (I know it will also
depend on the resulting code implementing this idea) ?

Any other ideas / suggestions are appreciated here.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15  9:42 RFC: Drm-connector properties managed by another driver / privacy screen support Hans de Goede
@ 2020-04-15  9:52 ` Daniel Vetter
  2020-04-15 10:11   ` Hans de Goede
  2020-04-15 15:28 ` Jani Nikula
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-15  9:52 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Wed, Apr 15, 2020 at 11:42 AM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi All,
>
> Somewhat long mail, so I've divided it into "chapters", sorry.
>
> 1. Back ground info
> -------------------
>
> First some quick background, some recent Lenovo models have
> a builtin privacy screen which can be turned on/off in software
> and the kernel recently has gotten support for this:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=110ea1d833ad291272d52e0a40a06157a3d9ba17
>
> We have been looking into adding support for this to GNOME,
> but the userspace API added by the above commit is very
> Thinkpad specific, and we would rather not rely on an
> userspace API which is specific to 1 series of laptops.
>
> When we started discussing this I had already seen some versions
> of Rajat's "drm/i915 Support for integrated privacy screen" series:
>
> https://patchwork.freedesktop.org/series/74650/
>
> Which adds support for integrated privacy screens
> as a drm-connector property. Anyone who has been involved
> in the backlight brightness control saga we have with
> the sysfs backlight class instantly knows/feels that
> this is the right way to handle this.
>
> So now we want to export the Thinkpad lcdshadow
> attribute as a drm-connector property.
>
>
> 2. Problem + Possible solutions
> -------------------------------
>
> The problem is that the code to get/set the lcdshadow
> setting and also the code to listen for firmware (hardcoded
> hotkeys) triggered state changes all lives inside the thinkpad_acpi
> driver; and to export the lcdshadow setting as a drm property
> we need to access that code (and it is too big to just copy
> over).
>
> One thing which makes this trickier is that all properties must
> be attached to the connector before it is registered, we cannot
> add it at a later time.
>
> I see 3 possible solutions here:
>
> i. Export some symbols from thinkpad_acpi and directly call these
> from drm_connector_create_standard_properties and other
> drm_connector functions if the thinkpad_acpi module is enabled.
> Note this should be done in the core drm_connector functions since
> the GPU might be one of i915 / amdgpu / nouveau. I believe that
> it is clear that this solution is not very elegant.
>
> A variant of this would be to have a separate helper module
> (probaly a few static inlines in a .h) which exports some hooks for
> i915 / amdgpu / nouveau to call this way we at least keep the
> ugliness out of the core and keep the module-dep on thinkpad_acpi
> limited to the i915 / amdgpu / nouveau modules. This might
> actually not be too bad, except that currently the thinkpad_acpi
> module refuses to load on non thinkpads...
>
>
> ii. Currently the "privacy-screen" property added by Rajat's
> patch-set is an enum with 2 possible values:
> "Enabled"
> "Disabled"
>
> We could add a third value "Not Available", which would be the
> default and then for internal panels always add the property
> so that we avoid the problem that detecting if the laptop has
> an internal privacy screen needs to be done before the connector
> is registered. Then we can add some hooks which allow an
> lcdshadow-driver to register itself against a connector later
> (which is non trivial wrt probe order, but lets ignore that for now).
>
>
> iii. We can add a generic framework to allow drivers outside
> of the drm-subsys to register something drm_property(ish) specifying
> a dev_name() and connector-name to which to attach the property
> when that connector gets created on that device.
>
> This requires the driver registering this property to be fully
> loaded before the connector gets registered.

iv. What every SoC subsystem does:

- lcdshadow drivers register drivers
- drm drivers look them up
- if stuff isn't there yet, we delay loading with EPROBE_DEFER until
the entire thing is assembled.

That's what we're doing already for other standardized components like
drm_bridge or drm_panel, and I think that's also the right approach
for backlight and anything else like that. Hand-rolling our own
EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
to real pain. Also, with EPROBE_DEFER we have one standard way of
building a driver from component, which spans subsystems and is also
the underlying magic that makes stuff like component.c work.

Wrt the actual userspace interface, I think the drm core should handle
this as much as possible. Same way we let drm core handle a lot of the
atomic property stuff, to make sure things are standardized. So

- add an lcdshadow pointer to struct drm_connector
- implement the property glue code in drm core completely, at least
for the read side
- for the write side we might want to have some drm wrappers drivers
can call to at the appropriate times to e.g. restore privacy screen
settings when the panel gets enabled. In case that's needed.

Also one thing that the EPROBE_DEFER stuff forces us to handle
correctly is to track these depencies. That's the other nightmare in
backlight land, essentially you have no idea of knowing (in userspace)
whether the backlight driver you want is actually loaded, resulting in
lots of fun. The kernel is the only thing that can know, and for hw
that's built-in there's really no excuse to not know. So a model where
stuff gets assembled after drm_dev_register() is imo just plain buggy.

This means that the lcdshadow subsystem needs to have some idea of
whether it needs a driver, so that it can correctly differentiate
between EPROBE_DEFER and ENOENT error cases. In the latter the driver
should continue loading ofc.

Cheers, Daniel

> 3. Picking a solution
> ---------------------
>
> i. is just ugly, full stop, but also very KISS which still makes
> it somewhat appealing. Also i. does not scale if we get other
> vendor specific interfaces for interacting with these
> privacy screens.
>
>
> ii. is ugly from an userspace API pov, if there is no
> "privacy-screen" then we really should not have the property at
> all rather then setting it to "Not Available". Still it might be
> an acceptable compromise I guess
>
>
> iii. is/was the one I liked, until I started looking at the
> drm_connector code and saw that all properties must be attached
> before registering the connector, bummer. Then this morning I
> remembered that the i915 driver has a similar issue with the
> gvt stuff / the kvmgt module. The kvmgt module must be loaded
> and fully initialized before the i915 driver loads. This has
> been solved with module softdeps.
>
> I think that we can do the same here having either the
> i915+nouveau+amdgpu drivers; or the drm-core have a softdep on
> thinkpad_acpi so that it can register the lcdshadow property
> with the to-be-written framework for externally managed props
> before the internal panels drm-connector gets created.
>
> This also allows the thinkpad_acpi module_init function to
> return -ENODEV on non Thinkpad devices, so that it will
> not take up memory, where as with i. we would always need to
> have it in memory.
>
> I'm currently leaning towards iii. combined with using
> MODULE_SOFTDEP("pre: thinkpad_acpi") to make sure the thinkpad_acpi
> driver can do its thing before the internal display connector gets
> created.
>
> Daniel (and others) does this (iii. + softdeps) sound like
> something which might be acceptable (I know it will also
> depend on the resulting code implementing this idea) ?
>
> Any other ideas / suggestions are appreciated here.
>
> Regards,
>
> Hans
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15  9:52 ` Daniel Vetter
@ 2020-04-15 10:11   ` Hans de Goede
  2020-04-15 10:22     ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 10:11 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 11:52 AM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 11:42 AM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi All,
>>
>> Somewhat long mail, so I've divided it into "chapters", sorry.
>>
>> 1. Back ground info
>> -------------------
>>
>> First some quick background, some recent Lenovo models have
>> a builtin privacy screen which can be turned on/off in software
>> and the kernel recently has gotten support for this:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=110ea1d833ad291272d52e0a40a06157a3d9ba17
>>
>> We have been looking into adding support for this to GNOME,
>> but the userspace API added by the above commit is very
>> Thinkpad specific, and we would rather not rely on an
>> userspace API which is specific to 1 series of laptops.
>>
>> When we started discussing this I had already seen some versions
>> of Rajat's "drm/i915 Support for integrated privacy screen" series:
>>
>> https://patchwork.freedesktop.org/series/74650/
>>
>> Which adds support for integrated privacy screens
>> as a drm-connector property. Anyone who has been involved
>> in the backlight brightness control saga we have with
>> the sysfs backlight class instantly knows/feels that
>> this is the right way to handle this.
>>
>> So now we want to export the Thinkpad lcdshadow
>> attribute as a drm-connector property.
>>
>>
>> 2. Problem + Possible solutions
>> -------------------------------
>>
>> The problem is that the code to get/set the lcdshadow
>> setting and also the code to listen for firmware (hardcoded
>> hotkeys) triggered state changes all lives inside the thinkpad_acpi
>> driver; and to export the lcdshadow setting as a drm property
>> we need to access that code (and it is too big to just copy
>> over).
>>
>> One thing which makes this trickier is that all properties must
>> be attached to the connector before it is registered, we cannot
>> add it at a later time.
>>
>> I see 3 possible solutions here:
>>
>> i. Export some symbols from thinkpad_acpi and directly call these
>> from drm_connector_create_standard_properties and other
>> drm_connector functions if the thinkpad_acpi module is enabled.
>> Note this should be done in the core drm_connector functions since
>> the GPU might be one of i915 / amdgpu / nouveau. I believe that
>> it is clear that this solution is not very elegant.
>>
>> A variant of this would be to have a separate helper module
>> (probaly a few static inlines in a .h) which exports some hooks for
>> i915 / amdgpu / nouveau to call this way we at least keep the
>> ugliness out of the core and keep the module-dep on thinkpad_acpi
>> limited to the i915 / amdgpu / nouveau modules. This might
>> actually not be too bad, except that currently the thinkpad_acpi
>> module refuses to load on non thinkpads...
>>
>>
>> ii. Currently the "privacy-screen" property added by Rajat's
>> patch-set is an enum with 2 possible values:
>> "Enabled"
>> "Disabled"
>>
>> We could add a third value "Not Available", which would be the
>> default and then for internal panels always add the property
>> so that we avoid the problem that detecting if the laptop has
>> an internal privacy screen needs to be done before the connector
>> is registered. Then we can add some hooks which allow an
>> lcdshadow-driver to register itself against a connector later
>> (which is non trivial wrt probe order, but lets ignore that for now).
>>
>>
>> iii. We can add a generic framework to allow drivers outside
>> of the drm-subsys to register something drm_property(ish) specifying
>> a dev_name() and connector-name to which to attach the property
>> when that connector gets created on that device.
>>
>> This requires the driver registering this property to be fully
>> loaded before the connector gets registered.
> 
> iv. What every SoC subsystem does:
> 
> - lcdshadow drivers register drivers
> - drm drivers look them up
> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> the entire thing is assembled.
> 
> That's what we're doing already for other standardized components like
> drm_bridge or drm_panel, and I think that's also the right approach
> for backlight and anything else like that. Hand-rolling our own
> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> to real pain. Also, with EPROBE_DEFER we have one standard way of
> building a driver from component, which spans subsystems and is also
> the underlying magic that makes stuff like component.c work.

On the SoCs we have devicetree telling us what components there
are, so we can wait for them to show up. The only way to figure out
if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
or duplicating a lot of code from thinkpad_acpi. Edit:
also see below for a possible solution.

> Wrt the actual userspace interface, I think the drm core should handle
> this as much as possible. Same way we let drm core handle a lot of the
> atomic property stuff, to make sure things are standardized.

Agreed.


> So
> 
> - add an lcdshadow pointer to struct drm_connector
> - implement the property glue code in drm core completely, at least
> for the read side
> - for the write side we might want to have some drm wrappers drivers
> can call to at the appropriate times to e.g. restore privacy screen
> settings when the panel gets enabled. In case that's needed.
> 
> Also one thing that the EPROBE_DEFER stuff forces us to handle
> correctly is to track these depencies. That's the other nightmare in
> backlight land, essentially you have no idea of knowing (in userspace)
> whether the backlight driver you want is actually loaded, resulting in
> lots of fun. The kernel is the only thing that can know, and for hw
> that's built-in there's really no excuse to not know. So a model where
> stuff gets assembled after drm_dev_register() is imo just plain buggy.
> 
> This means that the lcdshadow subsystem needs to have some idea of
> whether it needs a driver, so that it can correctly differentiate
> between EPROBE_DEFER and ENOENT error cases. In the latter the driver
> should continue loading ofc.

Right, so how would the lcdshadow subsystem do this? The only way
would be for it to say try and modprobe thinkpad_acpi from its
core init function (if thinkpad_acpi is enabled).  IOW it is the usual
x86 mess.  I guess we could have something like this in a theoretical
to be added lcdshadow subsystem:

/* Add comment explaining why we need this messy stuff here */
const char * const shadow_providers[] = {
#ifdef CONFIG_THINKPAD_ACPI_MODULE
	"thinkpad_acpi",
#endif
#ifdef CONFIG_OTHER_MODULE
	"other",
#endif
	NULL
};

int module_init(void)
{
	/* do actual setup of the ?class? */

	for (i = 0; shadow_providers[i]; i++)
		request_module(shadow_providers[i]);

	return 0;
}

And then simply have the function which gets a lcd_shadow provider
provide -ENOENT if there are none.

One problem here is that this will work for modules since
the drm-core will depend on modules from the lcdshadow-core,
so the lcdshadow-core module will get loaded first and will
then try to load thinkpad_acpi, which will return with -ENODEV
from its module_init on non ThinkPads. We could even put the
request_module loop in some other init_once function so that
things will also work when the lcdshadow-core is builtin.

But how do we ensure that thinkpad_acpi will get to register
its lcdshadow before say i915 gets probed if everything is builtin?

I guess my SOFTDEP solution has the same issue though. Do you
know has this is dealt with for kvmgt ?

Regards,

Hans



>> 3. Picking a solution
>> ---------------------
>>
>> i. is just ugly, full stop, but also very KISS which still makes
>> it somewhat appealing. Also i. does not scale if we get other
>> vendor specific interfaces for interacting with these
>> privacy screens.
>>
>>
>> ii. is ugly from an userspace API pov, if there is no
>> "privacy-screen" then we really should not have the property at
>> all rather then setting it to "Not Available". Still it might be
>> an acceptable compromise I guess
>>
>>
>> iii. is/was the one I liked, until I started looking at the
>> drm_connector code and saw that all properties must be attached
>> before registering the connector, bummer. Then this morning I
>> remembered that the i915 driver has a similar issue with the
>> gvt stuff / the kvmgt module. The kvmgt module must be loaded
>> and fully initialized before the i915 driver loads. This has
>> been solved with module softdeps.
>>
>> I think that we can do the same here having either the
>> i915+nouveau+amdgpu drivers; or the drm-core have a softdep on
>> thinkpad_acpi so that it can register the lcdshadow property
>> with the to-be-written framework for externally managed props
>> before the internal panels drm-connector gets created.
>>
>> This also allows the thinkpad_acpi module_init function to
>> return -ENODEV on non Thinkpad devices, so that it will
>> not take up memory, where as with i. we would always need to
>> have it in memory.
>>
>> I'm currently leaning towards iii. combined with using
>> MODULE_SOFTDEP("pre: thinkpad_acpi") to make sure the thinkpad_acpi
>> driver can do its thing before the internal display connector gets
>> created.
>>
>> Daniel (and others) does this (iii. + softdeps) sound like
>> something which might be acceptable (I know it will also
>> depend on the resulting code implementing this idea) ?
>>
>> Any other ideas / suggestions are appreciated here.
>>
>> Regards,
>>
>> Hans
>>
> 
> 

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 10:11   ` Hans de Goede
@ 2020-04-15 10:22     ` Daniel Vetter
  2020-04-15 11:39       ` Hans de Goede
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-15 10:22 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 4/15/20 11:52 AM, Daniel Vetter wrote:
> > On Wed, Apr 15, 2020 at 11:42 AM Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi All,
> >>
> >> Somewhat long mail, so I've divided it into "chapters", sorry.
> >>
> >> 1. Back ground info
> >> -------------------
> >>
> >> First some quick background, some recent Lenovo models have
> >> a builtin privacy screen which can be turned on/off in software
> >> and the kernel recently has gotten support for this:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=110ea1d833ad291272d52e0a40a06157a3d9ba17
> >>
> >> We have been looking into adding support for this to GNOME,
> >> but the userspace API added by the above commit is very
> >> Thinkpad specific, and we would rather not rely on an
> >> userspace API which is specific to 1 series of laptops.
> >>
> >> When we started discussing this I had already seen some versions
> >> of Rajat's "drm/i915 Support for integrated privacy screen" series:
> >>
> >> https://patchwork.freedesktop.org/series/74650/
> >>
> >> Which adds support for integrated privacy screens
> >> as a drm-connector property. Anyone who has been involved
> >> in the backlight brightness control saga we have with
> >> the sysfs backlight class instantly knows/feels that
> >> this is the right way to handle this.
> >>
> >> So now we want to export the Thinkpad lcdshadow
> >> attribute as a drm-connector property.
> >>
> >>
> >> 2. Problem + Possible solutions
> >> -------------------------------
> >>
> >> The problem is that the code to get/set the lcdshadow
> >> setting and also the code to listen for firmware (hardcoded
> >> hotkeys) triggered state changes all lives inside the thinkpad_acpi
> >> driver; and to export the lcdshadow setting as a drm property
> >> we need to access that code (and it is too big to just copy
> >> over).
> >>
> >> One thing which makes this trickier is that all properties must
> >> be attached to the connector before it is registered, we cannot
> >> add it at a later time.
> >>
> >> I see 3 possible solutions here:
> >>
> >> i. Export some symbols from thinkpad_acpi and directly call these
> >> from drm_connector_create_standard_properties and other
> >> drm_connector functions if the thinkpad_acpi module is enabled.
> >> Note this should be done in the core drm_connector functions since
> >> the GPU might be one of i915 / amdgpu / nouveau. I believe that
> >> it is clear that this solution is not very elegant.
> >>
> >> A variant of this would be to have a separate helper module
> >> (probaly a few static inlines in a .h) which exports some hooks for
> >> i915 / amdgpu / nouveau to call this way we at least keep the
> >> ugliness out of the core and keep the module-dep on thinkpad_acpi
> >> limited to the i915 / amdgpu / nouveau modules. This might
> >> actually not be too bad, except that currently the thinkpad_acpi
> >> module refuses to load on non thinkpads...
> >>
> >>
> >> ii. Currently the "privacy-screen" property added by Rajat's
> >> patch-set is an enum with 2 possible values:
> >> "Enabled"
> >> "Disabled"
> >>
> >> We could add a third value "Not Available", which would be the
> >> default and then for internal panels always add the property
> >> so that we avoid the problem that detecting if the laptop has
> >> an internal privacy screen needs to be done before the connector
> >> is registered. Then we can add some hooks which allow an
> >> lcdshadow-driver to register itself against a connector later
> >> (which is non trivial wrt probe order, but lets ignore that for now).
> >>
> >>
> >> iii. We can add a generic framework to allow drivers outside
> >> of the drm-subsys to register something drm_property(ish) specifying
> >> a dev_name() and connector-name to which to attach the property
> >> when that connector gets created on that device.
> >>
> >> This requires the driver registering this property to be fully
> >> loaded before the connector gets registered.
> >
> > iv. What every SoC subsystem does:
> >
> > - lcdshadow drivers register drivers
> > - drm drivers look them up
> > - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> > the entire thing is assembled.
> >
> > That's what we're doing already for other standardized components like
> > drm_bridge or drm_panel, and I think that's also the right approach
> > for backlight and anything else like that. Hand-rolling our own
> > EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> > to real pain. Also, with EPROBE_DEFER we have one standard way of
> > building a driver from component, which spans subsystems and is also
> > the underlying magic that makes stuff like component.c work.
>
> On the SoCs we have devicetree telling us what components there
> are, so we can wait for them to show up. The only way to figure out
> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
> or duplicating a lot of code from thinkpad_acpi. Edit:
> also see below for a possible solution.

Yup it sucks. I think all we can do is have a small acpi match
function (which yes will duplicate some of the thinkpad_acpi driver
logic) to re-create that information and give us a "should we have a
lcdshadow driver for this $pci_device" answer.

This need for an in-kernel source of truth for "which backlight, if
any, do I need" is why the backlight stuff never got fixed. It's a
really hard problem, and doing the table flip and just letting
userspace deal with the mess at least avoids having to face the fact
that the kernel is totally failing here. It's made worse for backlight
because of 20 years of hacked up systems that "work", and we can't
regress any of them.

I really want to avoid that situation for anything new like lcdshadow.

> > Wrt the actual userspace interface, I think the drm core should handle
> > this as much as possible. Same way we let drm core handle a lot of the
> > atomic property stuff, to make sure things are standardized.
>
> Agreed.
>
>
> > So
> >
> > - add an lcdshadow pointer to struct drm_connector
> > - implement the property glue code in drm core completely, at least
> > for the read side
> > - for the write side we might want to have some drm wrappers drivers
> > can call to at the appropriate times to e.g. restore privacy screen
> > settings when the panel gets enabled. In case that's needed.
> >
> > Also one thing that the EPROBE_DEFER stuff forces us to handle
> > correctly is to track these depencies. That's the other nightmare in
> > backlight land, essentially you have no idea of knowing (in userspace)
> > whether the backlight driver you want is actually loaded, resulting in
> > lots of fun. The kernel is the only thing that can know, and for hw
> > that's built-in there's really no excuse to not know. So a model where
> > stuff gets assembled after drm_dev_register() is imo just plain buggy.
> >
> > This means that the lcdshadow subsystem needs to have some idea of
> > whether it needs a driver, so that it can correctly differentiate
> > between EPROBE_DEFER and ENOENT error cases. In the latter the driver
> > should continue loading ofc.
>
> Right, so how would the lcdshadow subsystem do this? The only way
> would be for it to say try and modprobe thinkpad_acpi from its
> core init function (if thinkpad_acpi is enabled).  IOW it is the usual
> x86 mess.  I guess we could have something like this in a theoretical
> to be added lcdshadow subsystem:
>
> /* Add comment explaining why we need this messy stuff here */
> const char * const shadow_providers[] = {
> #ifdef CONFIG_THINKPAD_ACPI_MODULE
>         "thinkpad_acpi",
> #endif
> #ifdef CONFIG_OTHER_MODULE
>         "other",
> #endif
>         NULL
> };
>
> int module_init(void)
> {
>         /* do actual setup of the ?class? */
>
>         for (i = 0; shadow_providers[i]; i++)
>                 request_module(shadow_providers[i]);
>
>         return 0;
> }

Hm I think explicitly loading drivers feels very much not device model
like. Don't these drivers auto-load, matching on acpi functions? I
guess if that doesn't exist, then we'd need to fix that one first :-/
In general no request_module please, that shouldn't be needed either.

The trouble with request_module is also that (afaiui) it doesn't
really work well with parallel module load and all that, for
EPROBE_DEFER to work we do need to be able to answer "should we have a
driver?" independently of whether that driver has loaded already or
not.

> And then simply have the function which gets a lcd_shadow provider
> provide -ENOENT if there are none.
>
> One problem here is that this will work for modules since
> the drm-core will depend on modules from the lcdshadow-core,
> so the lcdshadow-core module will get loaded first and will
> then try to load thinkpad_acpi, which will return with -ENODEV
> from its module_init on non ThinkPads. We could even put the
> request_module loop in some other init_once function so that
> things will also work when the lcdshadow-core is builtin.
>
> But how do we ensure that thinkpad_acpi will get to register
> its lcdshadow before say i915 gets probed if everything is builtin?

EPROBE_DEFER. Everyone hates it, but it does work. It's really kinda
neat magic. The only pain point is that you might need to wire
EPROBE_DEFER through a few layers in i915, but we do have a lot of
that in place already, plus we have solid error-injecting load error
tests in igt. So that part shouldn't be a big deal.

> I guess my SOFTDEP solution has the same issue though. Do you
> know has this is dealt with for kvmgt ?

kvmgt? What do you mean there? kvmgt is just a user of i915-gem, if
you enable it in the config (and module option too iirc) then it works
more like an additional uapi layer, similar to how we handle fbdev
emulation. So totally different scenario, since the main driver is
100% in control of the situation and controls the register/unregister
lifetime cycles completely. There's no indepedent part somewhere else
for kvmgt or fbdev emulation.

Or I'm totally misunderstanding what you mean with this example here.

Cheers, Daniel

>
> Regards,
>
> Hans
>
>
>
> >> 3. Picking a solution
> >> ---------------------
> >>
> >> i. is just ugly, full stop, but also very KISS which still makes
> >> it somewhat appealing. Also i. does not scale if we get other
> >> vendor specific interfaces for interacting with these
> >> privacy screens.
> >>
> >>
> >> ii. is ugly from an userspace API pov, if there is no
> >> "privacy-screen" then we really should not have the property at
> >> all rather then setting it to "Not Available". Still it might be
> >> an acceptable compromise I guess
> >>
> >>
> >> iii. is/was the one I liked, until I started looking at the
> >> drm_connector code and saw that all properties must be attached
> >> before registering the connector, bummer. Then this morning I
> >> remembered that the i915 driver has a similar issue with the
> >> gvt stuff / the kvmgt module. The kvmgt module must be loaded
> >> and fully initialized before the i915 driver loads. This has
> >> been solved with module softdeps.
> >>
> >> I think that we can do the same here having either the
> >> i915+nouveau+amdgpu drivers; or the drm-core have a softdep on
> >> thinkpad_acpi so that it can register the lcdshadow property
> >> with the to-be-written framework for externally managed props
> >> before the internal panels drm-connector gets created.
> >>
> >> This also allows the thinkpad_acpi module_init function to
> >> return -ENODEV on non Thinkpad devices, so that it will
> >> not take up memory, where as with i. we would always need to
> >> have it in memory.
> >>
> >> I'm currently leaning towards iii. combined with using
> >> MODULE_SOFTDEP("pre: thinkpad_acpi") to make sure the thinkpad_acpi
> >> driver can do its thing before the internal display connector gets
> >> created.
> >>
> >> Daniel (and others) does this (iii. + softdeps) sound like
> >> something which might be acceptable (I know it will also
> >> depend on the resulting code implementing this idea) ?
> >>
> >> Any other ideas / suggestions are appreciated here.
> >>
> >> Regards,
> >>
> >> Hans
> >>
> >
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 10:22     ` Daniel Vetter
@ 2020-04-15 11:39       ` Hans de Goede
  2020-04-15 11:56         ` Hans de Goede
  2020-04-15 12:01         ` Daniel Vetter
  0 siblings, 2 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 11:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 12:22 PM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 4/15/20 11:52 AM, Daniel Vetter wrote:

<snip>

>>> iv. What every SoC subsystem does:
>>>
>>> - lcdshadow drivers register drivers
>>> - drm drivers look them up
>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
>>> the entire thing is assembled.
>>>
>>> That's what we're doing already for other standardized components like
>>> drm_bridge or drm_panel, and I think that's also the right approach
>>> for backlight and anything else like that. Hand-rolling our own
>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
>>> building a driver from component, which spans subsystems and is also
>>> the underlying magic that makes stuff like component.c work.
>>
>> On the SoCs we have devicetree telling us what components there
>> are, so we can wait for them to show up. The only way to figure out
>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
>> or duplicating a lot of code from thinkpad_acpi. Edit:
>> also see below for a possible solution.
> 
> Yup it sucks. I think all we can do is have a small acpi match
> function (which yes will duplicate some of the thinkpad_acpi driver
> logic) to re-create that information and give us a "should we have a
> lcdshadow driver for this $pci_device" answer.

Ok, so questions about this solution:

1. Where should that match-function live ?

2. An acpi_thinkpad derived match-function will only be able to
    answer if there is an lcdshadow device/driver for the internal
    panel. It will not be able to tie this info to a certain PCI
    device. My plan is to pass NULL as dev_name when registering
    the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
    skip device-name matching (consider everything a match) for
    lcdshadow-devices registered with NULL as dev_name.

    So I guess in this case the mini match function should just
    ignore the passed in device?

> This need for an in-kernel source of truth for "which backlight, if
> any, do I need" is why the backlight stuff never got fixed. It's a
> really hard problem, and doing the table flip and just letting
> userspace deal with the mess at least avoids having to face the fact
> that the kernel is totally failing here. It's made worse for backlight
> because of 20 years of hacked up systems that "work", and we can't
> regress any of them.

Right, as discussed during last plumbers Christian Kellner and I
have written a plan to slowly resolve this. Unfortunately Christian
has not found the time to work on this yet.

> I really want to avoid that situation for anything new like lcdshadow.

Ack.

>>> Wrt the actual userspace interface, I think the drm core should handle
>>> this as much as possible. Same way we let drm core handle a lot of the
>>> atomic property stuff, to make sure things are standardized.
>>
>> Agreed.
>>
>>
>>> So
>>>
>>> - add an lcdshadow pointer to struct drm_connector
>>> - implement the property glue code in drm core completely, at least
>>> for the read side
>>> - for the write side we might want to have some drm wrappers drivers
>>> can call to at the appropriate times to e.g. restore privacy screen
>>> settings when the panel gets enabled. In case that's needed.
>>>
>>> Also one thing that the EPROBE_DEFER stuff forces us to handle
>>> correctly is to track these depencies. That's the other nightmare in
>>> backlight land, essentially you have no idea of knowing (in userspace)
>>> whether the backlight driver you want is actually loaded, resulting in
>>> lots of fun. The kernel is the only thing that can know, and for hw
>>> that's built-in there's really no excuse to not know. So a model where
>>> stuff gets assembled after drm_dev_register() is imo just plain buggy.
>>>
>>> This means that the lcdshadow subsystem needs to have some idea of
>>> whether it needs a driver, so that it can correctly differentiate
>>> between EPROBE_DEFER and ENOENT error cases. In the latter the driver
>>> should continue loading ofc.
>>
>> Right, so how would the lcdshadow subsystem do this? The only way
>> would be for it to say try and modprobe thinkpad_acpi from its
>> core init function (if thinkpad_acpi is enabled).  IOW it is the usual
>> x86 mess.  I guess we could have something like this in a theoretical
>> to be added lcdshadow subsystem:
>>
>> /* Add comment explaining why we need this messy stuff here */
>> const char * const shadow_providers[] = {
>> #ifdef CONFIG_THINKPAD_ACPI_MODULE
>>          "thinkpad_acpi",
>> #endif
>> #ifdef CONFIG_OTHER_MODULE
>>          "other",
>> #endif
>>          NULL
>> };
>>
>> int module_init(void)
>> {
>>          /* do actual setup of the ?class? */
>>
>>          for (i = 0; shadow_providers[i]; i++)
>>                  request_module(shadow_providers[i]);
>>
>>          return 0;
>> }
> 
> Hm I think explicitly loading drivers feels very much not device model
> like. Don't these drivers auto-load, matching on acpi functions?

thinkpad_acpi does autoload based on a number of ACPI device-ids,
the idea behind the above request_module is to avoid the need
to have the acpi-match function you mentioned above.

Basically what would happen is e.g. :

1. i915 loads, calls lcdshadow_get(dev, "eDP-1");
2. if this is the first lcdshadow_get() call then
    the lcdshadow core will do the request_module calls,
    allowing any of these modules to get loaded + probed and
    call e.g. lcdshadow_register(&mylcdshadowdev, <gfx-adapter-dev-name>, "eDP-1");
3. After the request modules the lcdshadow_get() will walk over
    the list of registered devices, including the ones just registered
    by the request_module calls and then hopefully find a match

So by doing the request-module calls before checking for
a matching lcdshadow dev, we avoid the need of having some of
the knowledge currently abstracted away in the thinkpad_acpi driver
duplicated inside the drm code somewhere.

> I guess if that doesn't exist, then we'd need to fix that one first :-/
> In general no request_module please, that shouldn't be needed either.
> 
> The trouble with request_module is also that (afaiui) it doesn't
> really work well with parallel module load and all that, for
> EPROBE_DEFER to work we do need to be able to answer "should we have a
> driver?" independently of whether that driver has loaded already or
> not.

The idea here is to avoid using EPROBE_DEFER (on x86 at least)
and either directly return the lcdshadow_dev or ENOENT. Also
see below.

>> And then simply have the function which gets a lcd_shadow provider
>> provide -ENOENT if there are none.
>>
>> One problem here is that this will work for modules since
>> the drm-core will depend on modules from the lcdshadow-core,
>> so the lcdshadow-core module will get loaded first and will
>> then try to load thinkpad_acpi, which will return with -ENODEV
>> from its module_init on non ThinkPads. We could even put the
>> request_module loop in some other init_once function so that
>> things will also work when the lcdshadow-core is builtin.
>>
>> But how do we ensure that thinkpad_acpi will get to register
>> its lcdshadow before say i915 gets probed if everything is builtin?
> 
> EPROBE_DEFER. Everyone hates it, but it does work. It's really kinda
> neat magic. The only pain point is that you might need to wire
> EPROBE_DEFER through a few layers in i915, but we do have a lot of
> that in place already,

AFAIK we do not have any EPROBE_DEFER handling in i915 in place at
all! There are _0_ checks for an EPROBE_DEFER return in all of the i915
code. We actually have a similar problem with backlight control when
controlled by an external PWM controller such as the PWM controller
of the Crystal Cove PMIC found on some BYT/CHT drivers or the
PWM controller found in the LPSS bits of BYT/CHT SoCs.

Since again we lack a clear hardware model like device tree,
we use lookup tables for the (external) PWM controllers on x86,
see the pwm_add_table calls in drivers/acpi/acpi_lpss.c
and drivers/mfd/intel_soc_pmic_core.c and the pwm_get call
in the i915 code simply continues on its happy way without
backlight control if the pwm_get fails, rather then dealing
with -EPROBE_DEFER. I looked into untangling this back then
but the i915 code really is not ready to unroll everything
it has done already once we are probing connectors.

I actually "fixed" the PWM issue by:

1. Adding a module-name field to the PWM lookup table
registered by the pwm_add_table calls.

2. Have the PWM core call request_module on that module_name
if it finds a match in the registered lookup_table (which
get registered early on), but the matching pwm_dev has not
been registered yet.

So I agree with you that ideally i915 would handle EPROBE_DEFER
but atm AFAIK it really does not handle that at all and
we are pretty far away from getting to a point where it
does handle that.

Assuming we are going to add some device/model specific
lcdshadow knowledge inside the lcdshadow core as you
suggested with your "small acpi match function" above,
we could do something similar to what the vga_switcheroo
code is doing for this and have a lcdshadow_defer_probe()
helper and call that really early in i915_pci_probe(),
which currently already has this for the vgaswitcheroo case:

         if (vga_switcheroo_client_probe_defer(pdev))
                 return -EPROBE_DEFER;

The reason why I suggested the request_module approach
is because as said it will allow lcdshadow_get()
to return either a device or -ENOENT and never -EPROBE_DEFER
and currently none of the i915 code, nor the nouveau code
nor the amdgpu code has any EPROBE_DEFER checks. They all
assume that once there probe function is called they can
continue on forward without unrolling and exiting from
probe with EPROBE_DEFER ever. I agree that that is somewhat
of a bad assumption now a days but fixing that is going to
be massive undertakig and I'm afraid that trying to deal
with it will lead to many many regressions.

So I would rather work around it by using request_module.

> plus we have solid error-injecting load error
> tests in igt. So that part shouldn't be a big deal.
> 
>> I guess my SOFTDEP solution has the same issue though. Do you
>> know has this is dealt with for kvmgt ?
> 
> kvmgt? What do you mean there? kvmgt is just a user of i915-gem, if
> you enable it in the config (and module option too iirc) then it works
> more like an additional uapi layer, similar to how we handle fbdev
> emulation. So totally different scenario, since the main driver is
> 100% in control of the situation and controls the register/unregister
> lifetime cycles completely. There's no indepedent part somewhere else
> for kvmgt or fbdev emulation.
> 
> Or I'm totally misunderstanding what you mean with this example here.

The i915 driver used to have a:

MODULE_SOFTDEP("pre: kvmgt")

But it seems that that has been replaced with some mechanism
or maybe the need for kvmgt to be loaded first has been removed/

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 11:39       ` Hans de Goede
@ 2020-04-15 11:56         ` Hans de Goede
  2020-04-15 12:01         ` Daniel Vetter
  1 sibling, 0 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 11:56 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 1:39 PM, Hans de Goede wrote:

<snip>
>>> /* Add comment explaining why we need this messy stuff here */
>>> const char * const shadow_providers[] = {
>>> #ifdef CONFIG_THINKPAD_ACPI_MODULE
>>>          "thinkpad_acpi",
>>> #endif
>>> #ifdef CONFIG_OTHER_MODULE
>>>          "other",
>>> #endif
>>>          NULL
>>> };
>>>
>>> int module_init(void)
>>> {
>>>          /* do actual setup of the ?class? */
>>>
>>>          for (i = 0; shadow_providers[i]; i++)
>>>                  request_module(shadow_providers[i]);
>>>
>>>          return 0;
>>> }
>>
>> Hm I think explicitly loading drivers feels very much not device model
>> like. Don't these drivers auto-load, matching on acpi functions?
> 
> thinkpad_acpi does autoload based on a number of ACPI device-ids,
> the idea behind the above request_module is to avoid the need
> to have the acpi-match function you mentioned above.
> 
> Basically what would happen is e.g. :
> 
> 1. i915 loads, calls lcdshadow_get(dev, "eDP-1");
> 2. if this is the first lcdshadow_get() call then
>     the lcdshadow core will do the request_module calls,
>     allowing any of these modules to get loaded + probed and
>     call e.g. lcdshadow_register(&mylcdshadowdev, <gfx-adapter-dev-name>, "eDP-1");
> 3. After the request modules the lcdshadow_get() will walk over
>     the list of registered devices, including the ones just registered
>     by the request_module calls and then hopefully find a match
> 
> So by doing the request-module calls before checking for
> a matching lcdshadow dev, we avoid the need of having some of
> the knowledge currently abstracted away in the thinkpad_acpi driver
> duplicated inside the drm code somewhere.
> 
>> I guess if that doesn't exist, then we'd need to fix that one first :-/
>> In general no request_module please, that shouldn't be needed either.
>>
>> The trouble with request_module is also that (afaiui) it doesn't
>> really work well with parallel module load and all that, for
>> EPROBE_DEFER to work we do need to be able to answer "should we have a
>> driver?" independently of whether that driver has loaded already or
>> not.
> 
> The idea here is to avoid using EPROBE_DEFER (on x86 at least)
> and either directly return the lcdshadow_dev or ENOENT. Also
> see below.

<snip>

> Assuming we are going to add some device/model specific
> lcdshadow knowledge inside the lcdshadow core as you
> suggested with your "small acpi match function" above,
> we could do something similar to what the vga_switcheroo
> code is doing for this and have a lcdshadow_defer_probe()
> helper and call that really early in i915_pci_probe(),
> which currently already has this for the vgaswitcheroo case:
> 
>          if (vga_switcheroo_client_probe_defer(pdev))
>                  return -EPROBE_DEFER;

So thinking more about this and given the total lack of
EPROBE_DEFER handling in the 3 major X86 GPU/kms drivers
I think that adding a lcdshadow_defer_probe() helper is
the way to go. This will also avoid the need for duplicating
the lcdshadow detect functionality in the small ACPI-match
functions you mentioned (although that might still be
interesting to speedup the boot).

When everything is builtin then each enabled "module"-s
module_init function will get called, we can call a
lcdshadow_probe_done("module-name") function from those
and the lcdshadow core can then track if all potential
lcdhadow providers have initialized before it stops
returning non 0 from lcdshadow_defer_probe().

Or if we still do the small match functions it could
be even smarter with this...

And for the modular case it can call request_module on
all (enabled as module) potential lcdhadow providers
(or again we could rely on the small match function
instead).

Then (on x86 at least) we can have lcdshadow_get never
return -EPROBE_DEFER and avoid the need to solve the
lack of EPROBE_DEFER support in the 3 major x86 drivers.

And this is all kernel internal, so if that lack of
EPROBE_DEFER support ever gets fixed then we can drop
the lcdshadow_defer_probe() hack and make
lcdshadow_get also return -EPROBE_DEFER on x86 in
some cases.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 11:39       ` Hans de Goede
  2020-04-15 11:56         ` Hans de Goede
@ 2020-04-15 12:01         ` Daniel Vetter
  2020-04-15 13:02           ` Hans de Goede
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-15 12:01 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Thomas Zimmermann, Nitin Joshi1, Rajat Jain,
	Mark Pearson, Benjamin Berg

On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 4/15/20 12:22 PM, Daniel Vetter wrote:
> > On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
> > > 
> > > Hi,
> > > 
> > > On 4/15/20 11:52 AM, Daniel Vetter wrote:
> 
> <snip>
> 
> > > > iv. What every SoC subsystem does:
> > > > 
> > > > - lcdshadow drivers register drivers
> > > > - drm drivers look them up
> > > > - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> > > > the entire thing is assembled.
> > > > 
> > > > That's what we're doing already for other standardized components like
> > > > drm_bridge or drm_panel, and I think that's also the right approach
> > > > for backlight and anything else like that. Hand-rolling our own
> > > > EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> > > > to real pain. Also, with EPROBE_DEFER we have one standard way of
> > > > building a driver from component, which spans subsystems and is also
> > > > the underlying magic that makes stuff like component.c work.
> > > 
> > > On the SoCs we have devicetree telling us what components there
> > > are, so we can wait for them to show up. The only way to figure out
> > > if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
> > > or duplicating a lot of code from thinkpad_acpi. Edit:
> > > also see below for a possible solution.
> > 
> > Yup it sucks. I think all we can do is have a small acpi match
> > function (which yes will duplicate some of the thinkpad_acpi driver
> > logic) to re-create that information and give us a "should we have a
> > lcdshadow driver for this $pci_device" answer.
> 
> Ok, so questions about this solution:
> 
> 1. Where should that match-function live ?
> 
> 2. An acpi_thinkpad derived match-function will only be able to
>    answer if there is an lcdshadow device/driver for the internal
>    panel. It will not be able to tie this info to a certain PCI
>    device. My plan is to pass NULL as dev_name when registering
>    the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
>    skip device-name matching (consider everything a match) for
>    lcdshadow-devices registered with NULL as dev_name.
> 
>    So I guess in this case the mini match function should just
>    ignore the passed in device?

Yeah I think we can't really avoid that. I also expect that we'll need
ACPI and dt versions of this, and driver needs to know which one to call.
Since at least in a dt world the driver knows exactly for which dt node it
needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
more strict.

For the acpi version I'm not even sure we can do more than provide the
struct device * pointer of the gpu. I think if we ever get more than 1
lcdshadow driver on acpi systems we can add more stuff later on, for now
I'd just leave that out.

So maybe

acpi_lcdshadow_get(struct device *dev);

of_lcdshadow_get(struct device_node *np);

And with maybe a future plan to add some kind of enum or whatever to
acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.

We might also want a low-level lcdshadow_get() which only returns ENOENT
when the driver isn't there, and which leaves "do we really need one?" to
higher levels to answer.

I'd also lean towards putting lcdshadow headers/interfaces into
drivers/gpu, similar to how backlight (well that one is in drivers/video
for historical reasons) with driver implementations all over.

> > This need for an in-kernel source of truth for "which backlight, if
> > any, do I need" is why the backlight stuff never got fixed. It's a
> > really hard problem, and doing the table flip and just letting
> > userspace deal with the mess at least avoids having to face the fact
> > that the kernel is totally failing here. It's made worse for backlight
> > because of 20 years of hacked up systems that "work", and we can't
> > regress any of them.
> 
> Right, as discussed during last plumbers Christian Kellner and I
> have written a plan to slowly resolve this. Unfortunately Christian
> has not found the time to work on this yet.

Yeah, this is hopefully going to be really nice for acpi systems. DT stuff
is hopefully simpler to fix since at least the discovery part is already
all there, just not formalized.

Also given that this has been hurting for years I wouldn't worry about a
few months here and there.
-Daniel

> > I really want to avoid that situation for anything new like lcdshadow.
> 
> Ack.
> 
> > > > Wrt the actual userspace interface, I think the drm core should handle
> > > > this as much as possible. Same way we let drm core handle a lot of the
> > > > atomic property stuff, to make sure things are standardized.
> > > 
> > > Agreed.
> > > 
> > > 
> > > > So
> > > > 
> > > > - add an lcdshadow pointer to struct drm_connector
> > > > - implement the property glue code in drm core completely, at least
> > > > for the read side
> > > > - for the write side we might want to have some drm wrappers drivers
> > > > can call to at the appropriate times to e.g. restore privacy screen
> > > > settings when the panel gets enabled. In case that's needed.
> > > > 
> > > > Also one thing that the EPROBE_DEFER stuff forces us to handle
> > > > correctly is to track these depencies. That's the other nightmare in
> > > > backlight land, essentially you have no idea of knowing (in userspace)
> > > > whether the backlight driver you want is actually loaded, resulting in
> > > > lots of fun. The kernel is the only thing that can know, and for hw
> > > > that's built-in there's really no excuse to not know. So a model where
> > > > stuff gets assembled after drm_dev_register() is imo just plain buggy.
> > > > 
> > > > This means that the lcdshadow subsystem needs to have some idea of
> > > > whether it needs a driver, so that it can correctly differentiate
> > > > between EPROBE_DEFER and ENOENT error cases. In the latter the driver
> > > > should continue loading ofc.
> > > 
> > > Right, so how would the lcdshadow subsystem do this? The only way
> > > would be for it to say try and modprobe thinkpad_acpi from its
> > > core init function (if thinkpad_acpi is enabled).  IOW it is the usual
> > > x86 mess.  I guess we could have something like this in a theoretical
> > > to be added lcdshadow subsystem:
> > > 
> > > /* Add comment explaining why we need this messy stuff here */
> > > const char * const shadow_providers[] = {
> > > #ifdef CONFIG_THINKPAD_ACPI_MODULE
> > >          "thinkpad_acpi",
> > > #endif
> > > #ifdef CONFIG_OTHER_MODULE
> > >          "other",
> > > #endif
> > >          NULL
> > > };
> > > 
> > > int module_init(void)
> > > {
> > >          /* do actual setup of the ?class? */
> > > 
> > >          for (i = 0; shadow_providers[i]; i++)
> > >                  request_module(shadow_providers[i]);
> > > 
> > >          return 0;
> > > }
> > 
> > Hm I think explicitly loading drivers feels very much not device model
> > like. Don't these drivers auto-load, matching on acpi functions?
> 
> thinkpad_acpi does autoload based on a number of ACPI device-ids,
> the idea behind the above request_module is to avoid the need
> to have the acpi-match function you mentioned above.
> 
> Basically what would happen is e.g. :
> 
> 1. i915 loads, calls lcdshadow_get(dev, "eDP-1");
> 2. if this is the first lcdshadow_get() call then
>    the lcdshadow core will do the request_module calls,
>    allowing any of these modules to get loaded + probed and
>    call e.g. lcdshadow_register(&mylcdshadowdev, <gfx-adapter-dev-name>, "eDP-1");
> 3. After the request modules the lcdshadow_get() will walk over
>    the list of registered devices, including the ones just registered
>    by the request_module calls and then hopefully find a match
> 
> So by doing the request-module calls before checking for
> a matching lcdshadow dev, we avoid the need of having some of
> the knowledge currently abstracted away in the thinkpad_acpi driver
> duplicated inside the drm code somewhere.
> 
> > I guess if that doesn't exist, then we'd need to fix that one first :-/
> > In general no request_module please, that shouldn't be needed either.
> > 
> > The trouble with request_module is also that (afaiui) it doesn't
> > really work well with parallel module load and all that, for
> > EPROBE_DEFER to work we do need to be able to answer "should we have a
> > driver?" independently of whether that driver has loaded already or
> > not.
> 
> The idea here is to avoid using EPROBE_DEFER (on x86 at least)
> and either directly return the lcdshadow_dev or ENOENT. Also
> see below.
> 
> > > And then simply have the function which gets a lcd_shadow provider
> > > provide -ENOENT if there are none.
> > > 
> > > One problem here is that this will work for modules since
> > > the drm-core will depend on modules from the lcdshadow-core,
> > > so the lcdshadow-core module will get loaded first and will
> > > then try to load thinkpad_acpi, which will return with -ENODEV
> > > from its module_init on non ThinkPads. We could even put the
> > > request_module loop in some other init_once function so that
> > > things will also work when the lcdshadow-core is builtin.
> > > 
> > > But how do we ensure that thinkpad_acpi will get to register
> > > its lcdshadow before say i915 gets probed if everything is builtin?
> > 
> > EPROBE_DEFER. Everyone hates it, but it does work. It's really kinda
> > neat magic. The only pain point is that you might need to wire
> > EPROBE_DEFER through a few layers in i915, but we do have a lot of
> > that in place already,
> 
> AFAIK we do not have any EPROBE_DEFER handling in i915 in place at
> all! There are _0_ checks for an EPROBE_DEFER return in all of the i915
> code. We actually have a similar problem with backlight control when
> controlled by an external PWM controller such as the PWM controller
> of the Crystal Cove PMIC found on some BYT/CHT drivers or the
> PWM controller found in the LPSS bits of BYT/CHT SoCs.
> 
> Since again we lack a clear hardware model like device tree,
> we use lookup tables for the (external) PWM controllers on x86,
> see the pwm_add_table calls in drivers/acpi/acpi_lpss.c
> and drivers/mfd/intel_soc_pmic_core.c and the pwm_get call
> in the i915 code simply continues on its happy way without
> backlight control if the pwm_get fails, rather then dealing
> with -EPROBE_DEFER. I looked into untangling this back then
> but the i915 code really is not ready to unroll everything
> it has done already once we are probing connectors.
> 
> I actually "fixed" the PWM issue by:
> 
> 1. Adding a module-name field to the PWM lookup table
> registered by the pwm_add_table calls.
> 
> 2. Have the PWM core call request_module on that module_name
> if it finds a match in the registered lookup_table (which
> get registered early on), but the matching pwm_dev has not
> been registered yet.
> 
> So I agree with you that ideally i915 would handle EPROBE_DEFER
> but atm AFAIK it really does not handle that at all and
> we are pretty far away from getting to a point where it
> does handle that.
> 
> Assuming we are going to add some device/model specific
> lcdshadow knowledge inside the lcdshadow core as you
> suggested with your "small acpi match function" above,
> we could do something similar to what the vga_switcheroo
> code is doing for this and have a lcdshadow_defer_probe()
> helper and call that really early in i915_pci_probe(),
> which currently already has this for the vgaswitcheroo case:
> 
>         if (vga_switcheroo_client_probe_defer(pdev))
>                 return -EPROBE_DEFER;
> 
> The reason why I suggested the request_module approach
> is because as said it will allow lcdshadow_get()
> to return either a device or -ENOENT and never -EPROBE_DEFER
> and currently none of the i915 code, nor the nouveau code
> nor the amdgpu code has any EPROBE_DEFER checks. They all
> assume that once there probe function is called they can
> continue on forward without unrolling and exiting from
> probe with EPROBE_DEFER ever. I agree that that is somewhat
> of a bad assumption now a days but fixing that is going to
> be massive undertakig and I'm afraid that trying to deal
> with it will lead to many many regressions.
> 
> So I would rather work around it by using request_module.
> 
> > plus we have solid error-injecting load error
> > tests in igt. So that part shouldn't be a big deal.
> > 
> > > I guess my SOFTDEP solution has the same issue though. Do you
> > > know has this is dealt with for kvmgt ?
> > 
> > kvmgt? What do you mean there? kvmgt is just a user of i915-gem, if
> > you enable it in the config (and module option too iirc) then it works
> > more like an additional uapi layer, similar to how we handle fbdev
> > emulation. So totally different scenario, since the main driver is
> > 100% in control of the situation and controls the register/unregister
> > lifetime cycles completely. There's no indepedent part somewhere else
> > for kvmgt or fbdev emulation.
> > 
> > Or I'm totally misunderstanding what you mean with this example here.
> 
> The i915 driver used to have a:
> 
> MODULE_SOFTDEP("pre: kvmgt")
> 
> But it seems that that has been replaced with some mechanism
> or maybe the need for kvmgt to be loaded first has been removed/
> 
> Regards,
> 
> Hans
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 12:01         ` Daniel Vetter
@ 2020-04-15 13:02           ` Hans de Goede
  2020-04-15 17:54             ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 13:02 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 2:01 PM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 4/15/20 12:22 PM, Daniel Vetter wrote:
>>> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
>>
>> <snip>
>>
>>>>> iv. What every SoC subsystem does:
>>>>>
>>>>> - lcdshadow drivers register drivers
>>>>> - drm drivers look them up
>>>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
>>>>> the entire thing is assembled.
>>>>>
>>>>> That's what we're doing already for other standardized components like
>>>>> drm_bridge or drm_panel, and I think that's also the right approach
>>>>> for backlight and anything else like that. Hand-rolling our own
>>>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
>>>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
>>>>> building a driver from component, which spans subsystems and is also
>>>>> the underlying magic that makes stuff like component.c work.
>>>>
>>>> On the SoCs we have devicetree telling us what components there
>>>> are, so we can wait for them to show up. The only way to figure out
>>>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
>>>> or duplicating a lot of code from thinkpad_acpi. Edit:
>>>> also see below for a possible solution.
>>>
>>> Yup it sucks. I think all we can do is have a small acpi match
>>> function (which yes will duplicate some of the thinkpad_acpi driver
>>> logic) to re-create that information and give us a "should we have a
>>> lcdshadow driver for this $pci_device" answer.
>>
>> Ok, so questions about this solution:
>>
>> 1. Where should that match-function live ?
>>
>> 2. An acpi_thinkpad derived match-function will only be able to
>>     answer if there is an lcdshadow device/driver for the internal
>>     panel. It will not be able to tie this info to a certain PCI
>>     device. My plan is to pass NULL as dev_name when registering
>>     the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
>>     skip device-name matching (consider everything a match) for
>>     lcdshadow-devices registered with NULL as dev_name.
>>
>>     So I guess in this case the mini match function should just
>>     ignore the passed in device?
> 
> Yeah I think we can't really avoid that. I also expect that we'll need
> ACPI and dt versions of this, and driver needs to know which one to call.
> Since at least in a dt world the driver knows exactly for which dt node it
> needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
> more strict.
> 
> For the acpi version I'm not even sure we can do more than provide the
> struct device * pointer of the gpu. I think if we ever get more than 1
> lcdshadow driver on acpi systems we can add more stuff later on, for now
> I'd just leave that out.
> 
> So maybe
> 
> acpi_lcdshadow_get(struct device *dev);
> 
> of_lcdshadow_get(struct device_node *np);
> 
> And with maybe a future plan to add some kind of enum or whatever to
> acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
> an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.

Ok, note I plan to only implement the acpi version for now, I do
expect some non ACPI/x86 devices to show up with his feature
eventually but I believe it is best to implement this once
those actually show up. Esp. since this will also involve adding
some devicetree bindings for this.

> We might also want a low-level lcdshadow_get() which only returns ENOENT
> when the driver isn't there, and which leaves "do we really need one?" to
> higher levels to answer.

Right, so my latest idea on that is indeed a high-level lcdshadow_get()
which takes a struct device * and a connector-name and which never
returns EPROBE_DEFER.

As for leaving things to the higher levels to answer, as explained
in my other follow-up email I think that we should probably add a
lcdshadow_probe_defer() helper for this and call that early on
in the PCI-driver probe functions for the 3 major x86 GPU drivers.
Does that sound ok to you?

> I'd also lean towards putting lcdshadow headers/interfaces into
> drivers/gpu,

Ack, I think we should even make this drm specific and prefix it with
drm_ so that we get drm_lcdshadow_foo as functions, just to make
clear that this is maintained together with the other drm bits.

But my question about "where should this live" was mainly about
the light weight match helpers you suggested to use to figure out
if the device supports lcdshadow at all (and we thus should wait
for a driver) or not. E.g. I can see us adding a:

drivers/gpu/drm/drm_lcdshadow.c

file for the core bits and then maybe a:

drivers/gpu/drm/drm_lcdshadow_detect.c

file with the light weight match helpers, with each helper
wrapped in #if IS_ENABLED(CONFIG_THINKPAD_ACPI), etc. ?

> with driver implementations all over.

Ack.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15  9:42 RFC: Drm-connector properties managed by another driver / privacy screen support Hans de Goede
  2020-04-15  9:52 ` Daniel Vetter
@ 2020-04-15 15:28 ` Jani Nikula
  2020-04-15 15:40   ` Hans de Goede
  1 sibling, 1 reply; 54+ messages in thread
From: Jani Nikula @ 2020-04-15 15:28 UTC (permalink / raw)
  To: Hans de Goede, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter
  Cc: Benjamin Berg, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Nitin Joshi1, Rajat Jain, Mark Pearson

On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> ii. Currently the "privacy-screen" property added by Rajat's
> patch-set is an enum with 2 possible values:
> "Enabled"
> "Disabled"
>
> We could add a third value "Not Available", which would be the
> default and then for internal panels always add the property
> so that we avoid the problem that detecting if the laptop has
> an internal privacy screen needs to be done before the connector
> is registered. Then we can add some hooks which allow an
> lcdshadow-driver to register itself against a connector later
> (which is non trivial wrt probe order, but lets ignore that for now).

I regret dropping the ball on Rajat's series (sorry!).

I do think having the connector property for this is the way to go. Even
if we couldn't necessarily figure out all the details on the kernel
internal connections, can we settle on the property though, so we could
move forward with Rajat's series?

Moreover, do we actually need two properties, one which could indicate
userspace's desire for the property, and another that tells the hardware
state? I'd so very much like to have no in-kernel/in-firmware shortcuts
to enable/disable the privacy screen, and instead have any hardware
buttons just be events that the userspace could react to. However I
don't think that'll be the case unfortunately.


BR,
Jani.


-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 15:28 ` Jani Nikula
@ 2020-04-15 15:40   ` Hans de Goede
  2020-04-15 17:14     ` [External] " Mark Pearson
                       ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 15:40 UTC (permalink / raw)
  To: Jani Nikula, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Daniel Vetter
  Cc: Benjamin Berg, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 5:28 PM, Jani Nikula wrote:
> On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
>> ii. Currently the "privacy-screen" property added by Rajat's
>> patch-set is an enum with 2 possible values:
>> "Enabled"
>> "Disabled"
>>
>> We could add a third value "Not Available", which would be the
>> default and then for internal panels always add the property
>> so that we avoid the problem that detecting if the laptop has
>> an internal privacy screen needs to be done before the connector
>> is registered. Then we can add some hooks which allow an
>> lcdshadow-driver to register itself against a connector later
>> (which is non trivial wrt probe order, but lets ignore that for now).
> 
> I regret dropping the ball on Rajat's series (sorry!).
> 
> I do think having the connector property for this is the way to go.

I 100% agree.

> Even
> if we couldn't necessarily figure out all the details on the kernel
> internal connections, can we settle on the property though, so we could
> move forward with Rajat's series?

Yes please, this will also allow us to move forward with userspace
support even if for testing that we do some hacks for the kernel's
internal connections for now.

> Moreover, do we actually need two properties, one which could indicate
> userspace's desire for the property, and another that tells the hardware
> state?

No I do not think so. I would expect there to just be one property,
I guess that if the state is (partly) firmware controlled then there
might be a race, but we will need a notification mechanism (*) for
firmware triggered state changes anyways, so shortly after loosing
the race userspace will process the notification and it will know
about it.

One thing which might be useful is a way to signal that the property
is read-only in case we ever hit hw where that is the case.

> I'd so very much like to have no in-kernel/in-firmware shortcuts
> to enable/disable the privacy screen, and instead have any hardware
> buttons just be events that the userspace could react to. However I
> don't think that'll be the case unfortunately.

In my experience with keyboard-backlight support, we will (unfortunately)
see a mix and in some case we will get a notification that the firmware
has adjusted the state, rather then just getting a keypress and
dealing with that ourselves.  In some cases we may even be able to
choose, so the fw will deal with it by default but we can ask it
to just send a key-press.  But I do believe that we can *not* expect
that we will always just get a keypress for userspace to deal with.

Regards,

Hans


*) Some udev event I guess, I sorta assume there already is a
notification mechanism for property change notifications ?


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [External] Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 15:40   ` Hans de Goede
@ 2020-04-15 17:14     ` Mark Pearson
  2020-04-15 18:06       ` Hans de Goede
  2020-04-15 19:20     ` Rajat Jain
  2020-04-17  9:02     ` Pekka Paalanen
  2 siblings, 1 reply; 54+ messages in thread
From: Mark Pearson @ 2020-04-15 17:14 UTC (permalink / raw)
  To: Hans de Goede, Jani Nikula, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter
  Cc: Benjamin Berg, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Rajat Jain, Nitin Joshi1

Hi,

> -----Original Message-----
> From: Hans de Goede <hdegoede@redhat.com>
> Sent: Wednesday, April 15, 2020 11:41 AM
> On 4/15/20 5:28 PM, Jani Nikula wrote:
> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> > Moreover, do we actually need two properties, one which could indicate
> > userspace's desire for the property, and another that tells the hardware
> > state?
> 
> No I do not think so. I would expect there to just be one property,
> I guess that if the state is (partly) firmware controlled then there
> might be a race, but we will need a notification mechanism (*) for
> firmware triggered state changes anyways, so shortly after loosing
> the race userspace will process the notification and it will know
> about it.
> 
> One thing which might be useful is a way to signal that the property
> is read-only in case we ever hit hw where that is the case.
> 
> > I'd so very much like to have no in-kernel/in-firmware shortcuts
> > to enable/disable the privacy screen, and instead have any hardware
> > buttons just be events that the userspace could react to. However I
> > don't think that'll be the case unfortunately.
> 
> In my experience with keyboard-backlight support, we will (unfortunately)
> see a mix and in some case we will get a notification that the firmware
> has adjusted the state, rather then just getting a keypress and
> dealing with that ourselves.  In some cases we may even be able to
> choose, so the fw will deal with it by default but we can ask it
> to just send a key-press.  But I do believe that we can *not* expect
> that we will always just get a keypress for userspace to deal with.
> 
Afraid, the "hotkeys" control for ePrivacy (Fn+D I believe) is very unlikely 
to change - Windows uses it as well...
We can do notification of any hotkey presses to update the DRM layer (and
userspace) if that helps

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 13:02           ` Hans de Goede
@ 2020-04-15 17:54             ` Daniel Vetter
  2020-04-15 18:19               ` Hans de Goede
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-15 17:54 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Thomas Zimmermann, Nitin Joshi1, Rajat Jain,
	Mark Pearson, Benjamin Berg

On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 4/15/20 2:01 PM, Daniel Vetter wrote:
> > On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
> > > Hi,
> > > 
> > > On 4/15/20 12:22 PM, Daniel Vetter wrote:
> > > > On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > On 4/15/20 11:52 AM, Daniel Vetter wrote:
> > > 
> > > <snip>
> > > 
> > > > > > iv. What every SoC subsystem does:
> > > > > > 
> > > > > > - lcdshadow drivers register drivers
> > > > > > - drm drivers look them up
> > > > > > - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> > > > > > the entire thing is assembled.
> > > > > > 
> > > > > > That's what we're doing already for other standardized components like
> > > > > > drm_bridge or drm_panel, and I think that's also the right approach
> > > > > > for backlight and anything else like that. Hand-rolling our own
> > > > > > EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> > > > > > to real pain. Also, with EPROBE_DEFER we have one standard way of
> > > > > > building a driver from component, which spans subsystems and is also
> > > > > > the underlying magic that makes stuff like component.c work.
> > > > > 
> > > > > On the SoCs we have devicetree telling us what components there
> > > > > are, so we can wait for them to show up. The only way to figure out
> > > > > if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
> > > > > or duplicating a lot of code from thinkpad_acpi. Edit:
> > > > > also see below for a possible solution.
> > > > 
> > > > Yup it sucks. I think all we can do is have a small acpi match
> > > > function (which yes will duplicate some of the thinkpad_acpi driver
> > > > logic) to re-create that information and give us a "should we have a
> > > > lcdshadow driver for this $pci_device" answer.
> > > 
> > > Ok, so questions about this solution:
> > > 
> > > 1. Where should that match-function live ?
> > > 
> > > 2. An acpi_thinkpad derived match-function will only be able to
> > >     answer if there is an lcdshadow device/driver for the internal
> > >     panel. It will not be able to tie this info to a certain PCI
> > >     device. My plan is to pass NULL as dev_name when registering
> > >     the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
> > >     skip device-name matching (consider everything a match) for
> > >     lcdshadow-devices registered with NULL as dev_name.
> > > 
> > >     So I guess in this case the mini match function should just
> > >     ignore the passed in device?
> > 
> > Yeah I think we can't really avoid that. I also expect that we'll need
> > ACPI and dt versions of this, and driver needs to know which one to call.
> > Since at least in a dt world the driver knows exactly for which dt node it
> > needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
> > more strict.
> > 
> > For the acpi version I'm not even sure we can do more than provide the
> > struct device * pointer of the gpu. I think if we ever get more than 1
> > lcdshadow driver on acpi systems we can add more stuff later on, for now
> > I'd just leave that out.
> > 
> > So maybe
> > 
> > acpi_lcdshadow_get(struct device *dev);
> > 
> > of_lcdshadow_get(struct device_node *np);
> > 
> > And with maybe a future plan to add some kind of enum or whatever to
> > acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
> > an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
> 
> Ok, note I plan to only implement the acpi version for now, I do
> expect some non ACPI/x86 devices to show up with his feature
> eventually but I believe it is best to implement this once
> those actually show up. Esp. since this will also involve adding
> some devicetree bindings for this.

ofc, just wanted to lay out the entire thing. The DT version needs some
good bikeshed on the dt schema first anyway (so that the helper can decode
that directly).

> > We might also want a low-level lcdshadow_get() which only returns ENOENT
> > when the driver isn't there, and which leaves "do we really need one?" to
> > higher levels to answer.
> 
> Right, so my latest idea on that is indeed a high-level lcdshadow_get()
> which takes a struct device * and a connector-name and which never
> returns EPROBE_DEFER.
> 
> As for leaving things to the higher levels to answer, as explained
> in my other follow-up email I think that we should probably add a
> lcdshadow_probe_defer() helper for this and call that early on
> in the PCI-driver probe functions for the 3 major x86 GPU drivers.
> Does that sound ok to you?

Uh ... not pretty. There's still a lifetime problem that strictly speaking
there's nothing stopping the other driver from getting unloaded between
your _probe_defer and the subsequent _get. I think fixing this properly
(and screaming a bit at the error code, oh well) is better.

> > I'd also lean towards putting lcdshadow headers/interfaces into
> > drivers/gpu,
> 
> Ack, I think we should even make this drm specific and prefix it with
> drm_ so that we get drm_lcdshadow_foo as functions, just to make
> clear that this is maintained together with the other drm bits.

I want to avoid the impression that I'm on an evil plan to take over the
entire world, but personally very happy with a drm_ prefix for this.

> But my question about "where should this live" was mainly about
> the light weight match helpers you suggested to use to figure out
> if the device supports lcdshadow at all (and we thus should wait
> for a driver) or not. E.g. I can see us adding a:
> 
> drivers/gpu/drm/drm_lcdshadow.c
> 
> file for the core bits and then maybe a:
> 
> drivers/gpu/drm/drm_lcdshadow_detect.c
> 
> file with the light weight match helpers, with each helper
> wrapped in #if IS_ENABLED(CONFIG_THINKPAD_ACPI), etc. ?

I'd expect it's all going to be so tiny that separate file wont make much
sense. Thus far we're simply adding the 1-2 of_ helpers to the
corresponding file, with an #ifdef CONFIG_OF around them. It's ok enough.

> > with driver implementations all over.
> 
> Ack.
> 
> Regards,
> 
> Hans
> 

Cheers, Daniel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [External] Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 17:14     ` [External] " Mark Pearson
@ 2020-04-15 18:06       ` Hans de Goede
  0 siblings, 0 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 18:06 UTC (permalink / raw)
  To: Mark Pearson, Jani Nikula, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter
  Cc: Benjamin Berg, Christian Kellner, Javier Martinez Canillas,
	dri-devel, Rajat Jain, Nitin Joshi1

Hi Mark,

On 4/15/20 7:14 PM, Mark Pearson wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Hans de Goede <hdegoede@redhat.com>
>> Sent: Wednesday, April 15, 2020 11:41 AM
>> On 4/15/20 5:28 PM, Jani Nikula wrote:
>>> On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
>>> Moreover, do we actually need two properties, one which could indicate
>>> userspace's desire for the property, and another that tells the hardware
>>> state?
>>
>> No I do not think so. I would expect there to just be one property,
>> I guess that if the state is (partly) firmware controlled then there
>> might be a race, but we will need a notification mechanism (*) for
>> firmware triggered state changes anyways, so shortly after loosing
>> the race userspace will process the notification and it will know
>> about it.
>>
>> One thing which might be useful is a way to signal that the property
>> is read-only in case we ever hit hw where that is the case.
>>
>>> I'd so very much like to have no in-kernel/in-firmware shortcuts
>>> to enable/disable the privacy screen, and instead have any hardware
>>> buttons just be events that the userspace could react to. However I
>>> don't think that'll be the case unfortunately.
>>
>> In my experience with keyboard-backlight support, we will (unfortunately)
>> see a mix and in some case we will get a notification that the firmware
>> has adjusted the state, rather then just getting a keypress and
>> dealing with that ourselves.  In some cases we may even be able to
>> choose, so the fw will deal with it by default but we can ask it
>> to just send a key-press.  But I do believe that we can *not* expect
>> that we will always just get a keypress for userspace to deal with.
>>
> Afraid, the "hotkeys" control for ePrivacy (Fn+D I believe) is very unlikely
> to change - Windows uses it as well...
> We can do notification of any hotkey presses to update the DRM layer (and
> userspace) if that helps

We are not asking for changing the hotkey, what we would like is
for the hotkey to only send a notification that it was pressed
and for it to not actually do anything with the ePrivacy screen
state.  This does not need to be it defaults behavior, but we would
like to be able to ask the firmware to not act on it itself,
just like we can already disable the firmware/embedded controller
responding to e.g. brightness up/down key presses itself.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 17:54             ` Daniel Vetter
@ 2020-04-15 18:19               ` Hans de Goede
  2020-04-15 18:29                 ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 18:19 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 7:54 PM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
>> Hi,
>>
>> On 4/15/20 2:01 PM, Daniel Vetter wrote:
>>> On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 4/15/20 12:22 PM, Daniel Vetter wrote:
>>>>> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
>>>>
>>>> <snip>
>>>>
>>>>>>> iv. What every SoC subsystem does:
>>>>>>>
>>>>>>> - lcdshadow drivers register drivers
>>>>>>> - drm drivers look them up
>>>>>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
>>>>>>> the entire thing is assembled.
>>>>>>>
>>>>>>> That's what we're doing already for other standardized components like
>>>>>>> drm_bridge or drm_panel, and I think that's also the right approach
>>>>>>> for backlight and anything else like that. Hand-rolling our own
>>>>>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
>>>>>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
>>>>>>> building a driver from component, which spans subsystems and is also
>>>>>>> the underlying magic that makes stuff like component.c work.
>>>>>>
>>>>>> On the SoCs we have devicetree telling us what components there
>>>>>> are, so we can wait for them to show up. The only way to figure out
>>>>>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
>>>>>> or duplicating a lot of code from thinkpad_acpi. Edit:
>>>>>> also see below for a possible solution.
>>>>>
>>>>> Yup it sucks. I think all we can do is have a small acpi match
>>>>> function (which yes will duplicate some of the thinkpad_acpi driver
>>>>> logic) to re-create that information and give us a "should we have a
>>>>> lcdshadow driver for this $pci_device" answer.
>>>>
>>>> Ok, so questions about this solution:
>>>>
>>>> 1. Where should that match-function live ?
>>>>
>>>> 2. An acpi_thinkpad derived match-function will only be able to
>>>>      answer if there is an lcdshadow device/driver for the internal
>>>>      panel. It will not be able to tie this info to a certain PCI
>>>>      device. My plan is to pass NULL as dev_name when registering
>>>>      the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
>>>>      skip device-name matching (consider everything a match) for
>>>>      lcdshadow-devices registered with NULL as dev_name.
>>>>
>>>>      So I guess in this case the mini match function should just
>>>>      ignore the passed in device?
>>>
>>> Yeah I think we can't really avoid that. I also expect that we'll need
>>> ACPI and dt versions of this, and driver needs to know which one to call.
>>> Since at least in a dt world the driver knows exactly for which dt node it
>>> needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
>>> more strict.
>>>
>>> For the acpi version I'm not even sure we can do more than provide the
>>> struct device * pointer of the gpu. I think if we ever get more than 1
>>> lcdshadow driver on acpi systems we can add more stuff later on, for now
>>> I'd just leave that out.
>>>
>>> So maybe
>>>
>>> acpi_lcdshadow_get(struct device *dev);
>>>
>>> of_lcdshadow_get(struct device_node *np);
>>>
>>> And with maybe a future plan to add some kind of enum or whatever to
>>> acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
>>> an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
>>
>> Ok, note I plan to only implement the acpi version for now, I do
>> expect some non ACPI/x86 devices to show up with his feature
>> eventually but I believe it is best to implement this once
>> those actually show up. Esp. since this will also involve adding
>> some devicetree bindings for this.
> 
> ofc, just wanted to lay out the entire thing. The DT version needs some
> good bikeshed on the dt schema first anyway (so that the helper can decode
> that directly).
> 
>>> We might also want a low-level lcdshadow_get() which only returns ENOENT
>>> when the driver isn't there, and which leaves "do we really need one?" to
>>> higher levels to answer.
>>
>> Right, so my latest idea on that is indeed a high-level lcdshadow_get()
>> which takes a struct device * and a connector-name and which never
>> returns EPROBE_DEFER.
>>
>> As for leaving things to the higher levels to answer, as explained
>> in my other follow-up email I think that we should probably add a
>> lcdshadow_probe_defer() helper for this and call that early on
>> in the PCI-driver probe functions for the 3 major x86 GPU drivers.
>> Does that sound ok to you?
> 
> Uh ... not pretty. There's still a lifetime problem that strictly speaking
> there's nothing stopping the other driver from getting unloaded between
> your _probe_defer and the subsequent _get. I think fixing this properly
> (and screaming a bit at the error code, oh well) is better.

I would really like to separate the discussion and the work
on getting the 3 major x86 GPU drivers ready to deal with EPROBE_DEFER
from the lcdshadow discussion and work.  I expect getting these
3 drivers ready for EPROBE_DEFER is going to be a major undertaking
and I would like avoid introducing this significant scope creep
to the lcdshadow discussion, because it simply is a too big undertaking
to undertake without us getting a significant amount of manpower
specifically for this from somewhere.

Note I do agree with you that getting these 3 drivers ready
for EPROBE_DEFER handling is a worthwhile undertaking, but
it simply will take too much extra time and as such IMHO it
really is out of scope for the lcdshadow stuff.
I expect the amount of work (esp. also dealing with testing
and regressions) for the EPROBE_DEFER project by itself
to be a lot *more* work then the actual lcdshadow project.

So going with the assumption/decision that adding proper
EPROBE_DEFER handling to these 3 drivers is out of scope,
I believe that adding a lcdshadow_probe_defer() helper is
an ok solution/workaround for now.

As for your case of the other driver getting unloaded in between
the check and use of it, that can only happen by explicit user
action and in that case the worst thing what will happen
is the "privacy-screen" property missing from the connector,
which in that case is more or less exactly what the user
asked for.

>>> I'd also lean towards putting lcdshadow headers/interfaces into
>>> drivers/gpu,
>>
>> Ack, I think we should even make this drm specific and prefix it with
>> drm_ so that we get drm_lcdshadow_foo as functions, just to make
>> clear that this is maintained together with the other drm bits.
> 
> I want to avoid the impression that I'm on an evil plan to take over the
> entire world, but personally very happy with a drm_ prefix for this.

Hehe, ok.

>> But my question about "where should this live" was mainly about
>> the light weight match helpers you suggested to use to figure out
>> if the device supports lcdshadow at all (and we thus should wait
>> for a driver) or not. E.g. I can see us adding a:
>>
>> drivers/gpu/drm/drm_lcdshadow.c
>>
>> file for the core bits and then maybe a:
>>
>> drivers/gpu/drm/drm_lcdshadow_detect.c
>>
>> file with the light weight match helpers, with each helper
>> wrapped in #if IS_ENABLED(CONFIG_THINKPAD_ACPI), etc. ?
> 
> I'd expect it's all going to be so tiny that separate file wont make much
> sense. Thus far we're simply adding the 1-2 of_ helpers to the
> corresponding file, with an #ifdef CONFIG_OF around them. It's ok enough.

Ok, lets start with one file for now, we can always split it later.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 18:19               ` Hans de Goede
@ 2020-04-15 18:29                 ` Daniel Vetter
  2020-04-15 19:50                   ` Hans de Goede
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-15 18:29 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Wed, Apr 15, 2020 at 8:19 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 4/15/20 7:54 PM, Daniel Vetter wrote:
> > On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
> >> Hi,
> >>
> >> On 4/15/20 2:01 PM, Daniel Vetter wrote:
> >>> On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 4/15/20 12:22 PM, Daniel Vetter wrote:
> >>>>> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
> >>>>
> >>>> <snip>
> >>>>
> >>>>>>> iv. What every SoC subsystem does:
> >>>>>>>
> >>>>>>> - lcdshadow drivers register drivers
> >>>>>>> - drm drivers look them up
> >>>>>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> >>>>>>> the entire thing is assembled.
> >>>>>>>
> >>>>>>> That's what we're doing already for other standardized components like
> >>>>>>> drm_bridge or drm_panel, and I think that's also the right approach
> >>>>>>> for backlight and anything else like that. Hand-rolling our own
> >>>>>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> >>>>>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
> >>>>>>> building a driver from component, which spans subsystems and is also
> >>>>>>> the underlying magic that makes stuff like component.c work.
> >>>>>>
> >>>>>> On the SoCs we have devicetree telling us what components there
> >>>>>> are, so we can wait for them to show up. The only way to figure out
> >>>>>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
> >>>>>> or duplicating a lot of code from thinkpad_acpi. Edit:
> >>>>>> also see below for a possible solution.
> >>>>>
> >>>>> Yup it sucks. I think all we can do is have a small acpi match
> >>>>> function (which yes will duplicate some of the thinkpad_acpi driver
> >>>>> logic) to re-create that information and give us a "should we have a
> >>>>> lcdshadow driver for this $pci_device" answer.
> >>>>
> >>>> Ok, so questions about this solution:
> >>>>
> >>>> 1. Where should that match-function live ?
> >>>>
> >>>> 2. An acpi_thinkpad derived match-function will only be able to
> >>>>      answer if there is an lcdshadow device/driver for the internal
> >>>>      panel. It will not be able to tie this info to a certain PCI
> >>>>      device. My plan is to pass NULL as dev_name when registering
> >>>>      the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
> >>>>      skip device-name matching (consider everything a match) for
> >>>>      lcdshadow-devices registered with NULL as dev_name.
> >>>>
> >>>>      So I guess in this case the mini match function should just
> >>>>      ignore the passed in device?
> >>>
> >>> Yeah I think we can't really avoid that. I also expect that we'll need
> >>> ACPI and dt versions of this, and driver needs to know which one to call.
> >>> Since at least in a dt world the driver knows exactly for which dt node it
> >>> needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
> >>> more strict.
> >>>
> >>> For the acpi version I'm not even sure we can do more than provide the
> >>> struct device * pointer of the gpu. I think if we ever get more than 1
> >>> lcdshadow driver on acpi systems we can add more stuff later on, for now
> >>> I'd just leave that out.
> >>>
> >>> So maybe
> >>>
> >>> acpi_lcdshadow_get(struct device *dev);
> >>>
> >>> of_lcdshadow_get(struct device_node *np);
> >>>
> >>> And with maybe a future plan to add some kind of enum or whatever to
> >>> acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
> >>> an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
> >>
> >> Ok, note I plan to only implement the acpi version for now, I do
> >> expect some non ACPI/x86 devices to show up with his feature
> >> eventually but I believe it is best to implement this once
> >> those actually show up. Esp. since this will also involve adding
> >> some devicetree bindings for this.
> >
> > ofc, just wanted to lay out the entire thing. The DT version needs some
> > good bikeshed on the dt schema first anyway (so that the helper can decode
> > that directly).
> >
> >>> We might also want a low-level lcdshadow_get() which only returns ENOENT
> >>> when the driver isn't there, and which leaves "do we really need one?" to
> >>> higher levels to answer.
> >>
> >> Right, so my latest idea on that is indeed a high-level lcdshadow_get()
> >> which takes a struct device * and a connector-name and which never
> >> returns EPROBE_DEFER.
> >>
> >> As for leaving things to the higher levels to answer, as explained
> >> in my other follow-up email I think that we should probably add a
> >> lcdshadow_probe_defer() helper for this and call that early on
> >> in the PCI-driver probe functions for the 3 major x86 GPU drivers.
> >> Does that sound ok to you?
> >
> > Uh ... not pretty. There's still a lifetime problem that strictly speaking
> > there's nothing stopping the other driver from getting unloaded between
> > your _probe_defer and the subsequent _get. I think fixing this properly
> > (and screaming a bit at the error code, oh well) is better.
>
> I would really like to separate the discussion and the work
> on getting the 3 major x86 GPU drivers ready to deal with EPROBE_DEFER
> from the lcdshadow discussion and work.  I expect getting these
> 3 drivers ready for EPROBE_DEFER is going to be a major undertaking
> and I would like avoid introducing this significant scope creep
> to the lcdshadow discussion, because it simply is a too big undertaking
> to undertake without us getting a significant amount of manpower
> specifically for this from somewhere.
>
> Note I do agree with you that getting these 3 drivers ready
> for EPROBE_DEFER handling is a worthwhile undertaking, but
> it simply will take too much extra time and as such IMHO it
> really is out of scope for the lcdshadow stuff.
> I expect the amount of work (esp. also dealing with testing
> and regressions) for the EPROBE_DEFER project by itself
> to be a lot *more* work then the actual lcdshadow project.
>
> So going with the assumption/decision that adding proper
> EPROBE_DEFER handling to these 3 drivers is out of scope,
> I believe that adding a lcdshadow_probe_defer() helper is
> an ok solution/workaround for now.
>
> As for your case of the other driver getting unloaded in between
> the check and use of it, that can only happen by explicit user
> action and in that case the worst thing what will happen
> is the "privacy-screen" property missing from the connector,
> which in that case is more or less exactly what the user
> asked for.

For i915 we've had patches, for mei-hdcp integration. Until it became
clear that the mei subsystem is bonkers, and handles suspend/resume by
unloading! drivers.

Which forced i915 to unload and broke the world :-/

So at least for i915 the work should be done already, and just amount
to updating the patches Ram already had. No ideas about the other 2.

What I don't really want is an interface with races. So if fixing the
other drivers is too much work, what we could do is roughly:

- in the top-level probe function to an acpi_lcdshadow_get. This might
fail with EPROBE_DEFER.
- this gives us a dangling reference, but we can drop that one when we
exit the top-level probe function (both on success and on error cases)
- the 2nd acpi_lcdshadow_get call deep down should always succeed at
that point (since the top level holds a reference), so you could wrap
that in a WARNING(IS_ERR_PTR()). Ok that's a lie, if we concurrently
unload then the 2nd call still fails (the reference can never prevent
a hotunbind in the linux kernel device model), so this is exactly as
buggy as your version, so still needs a comment about that.

Adding a acpi_lcdshadow_probe_defer() function otoh just gives people
the impression that that's actually a correct way of doing this.

Then it's up to the driver maintainers whether they're ok with the
above hack of a fake reference, or not. As I said, I think for i915 it
should be fairly ok to just roll it out, but maybe the patches don't
apply at all anymore.

btw to make everything work you also need to set up a device_link
between the lcdshadow device (and it's driver, that's the more
important thing) and the gpu device. That device link will make sure
that
- suspend/resume is done in the right order
- driver load/unload is done in the right order, i.e. unloading of the
lcdshadow driver will automatically force an unbind of the gpu driver
first.

With that all the get/put will do is just refcount the final kfree of
the driver structure.

Cheers, Daniel

> >>> I'd also lean towards putting lcdshadow headers/interfaces into
> >>> drivers/gpu,
> >>
> >> Ack, I think we should even make this drm specific and prefix it with
> >> drm_ so that we get drm_lcdshadow_foo as functions, just to make
> >> clear that this is maintained together with the other drm bits.
> >
> > I want to avoid the impression that I'm on an evil plan to take over the
> > entire world, but personally very happy with a drm_ prefix for this.
>
> Hehe, ok.
>
> >> But my question about "where should this live" was mainly about
> >> the light weight match helpers you suggested to use to figure out
> >> if the device supports lcdshadow at all (and we thus should wait
> >> for a driver) or not. E.g. I can see us adding a:
> >>
> >> drivers/gpu/drm/drm_lcdshadow.c
> >>
> >> file for the core bits and then maybe a:
> >>
> >> drivers/gpu/drm/drm_lcdshadow_detect.c
> >>
> >> file with the light weight match helpers, with each helper
> >> wrapped in #if IS_ENABLED(CONFIG_THINKPAD_ACPI), etc. ?
> >
> > I'd expect it's all going to be so tiny that separate file wont make much
> > sense. Thus far we're simply adding the 1-2 of_ helpers to the
> > corresponding file, with an #ifdef CONFIG_OF around them. It's ok enough.
>
> Ok, lets start with one file for now, we can always split it later.
>
> Regards,
>
> Hans
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 15:40   ` Hans de Goede
  2020-04-15 17:14     ` [External] " Mark Pearson
@ 2020-04-15 19:20     ` Rajat Jain
  2020-04-15 21:10       ` Jani Nikula
  2020-04-17  9:02     ` Pekka Paalanen
  2 siblings, 1 reply; 54+ messages in thread
From: Rajat Jain @ 2020-04-15 19:20 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Mark Pearson, Benjamin Berg

Hello,

On Wed, Apr 15, 2020 at 8:40 AM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 4/15/20 5:28 PM, Jani Nikula wrote:
> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> >> ii. Currently the "privacy-screen" property added by Rajat's
> >> patch-set is an enum with 2 possible values:
> >> "Enabled"
> >> "Disabled"
> >>
> >> We could add a third value "Not Available", which would be the
> >> default and then for internal panels always add the property
> >> so that we avoid the problem that detecting if the laptop has
> >> an internal privacy screen needs to be done before the connector
> >> is registered. Then we can add some hooks which allow an
> >> lcdshadow-driver to register itself against a connector later
> >> (which is non trivial wrt probe order, but lets ignore that for now).
> >
> > I regret dropping the ball on Rajat's series (sorry!).
> >
> > I do think having the connector property for this is the way to go.
>
> I 100% agree.
>
> > Even
> > if we couldn't necessarily figure out all the details on the kernel
> > internal connections, can we settle on the property though, so we could
> > move forward with Rajat's series?

Thanks, it would be great!.

>
> Yes please, this will also allow us to move forward with userspace
> support even if for testing that we do some hacks for the kernel's
> internal connections for now.
>
> > Moreover, do we actually need two properties, one which could indicate
> > userspace's desire for the property, and another that tells the hardware
> > state?
>
> No I do not think so. I would expect there to just be one property,
> I guess that if the state is (partly) firmware controlled then there
> might be a race, but we will need a notification mechanism (*) for
> firmware triggered state changes anyways, so shortly after loosing
> the race userspace will process the notification and it will know
> about it.

I agree with Hans here that I think it would be better if we could do
it with one property.

 * I can imagine demand for laptops that have a "hardware kill switch"
for privacy screen (just like there are for camera etc today). So I
think in future we may have to deal with this case anyway. In such
devices it's the hardware (as opposite to firmware) that will change
the state. The HW will likely provide an interrupt to the software to
notify of the change. This is all imaginative at this point though.

* I think having 2 properties might be a confusing UAPI. Also, we have
existing properties like link-status that can be changed by both the
user and the hardware.

Thanks,

Rajat

>
> One thing which might be useful is a way to signal that the property
> is read-only in case we ever hit hw where that is the case.
>
> > I'd so very much like to have no in-kernel/in-firmware shortcuts
> > to enable/disable the privacy screen, and instead have any hardware
> > buttons just be events that the userspace could react to. However I
> > don't think that'll be the case unfortunately.
>
> In my experience with keyboard-backlight support, we will (unfortunately)
> see a mix and in some case we will get a notification that the firmware
> has adjusted the state, rather then just getting a keypress and
> dealing with that ourselves.  In some cases we may even be able to
> choose, so the fw will deal with it by default but we can ask it
> to just send a key-press.  But I do believe that we can *not* expect
> that we will always just get a keypress for userspace to deal with.
>
> Regards,
>
> Hans
>
>
> *) Some udev event I guess, I sorta assume there already is a
> notification mechanism for property change notifications ?
>
>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 18:29                 ` Daniel Vetter
@ 2020-04-15 19:50                   ` Hans de Goede
  2020-04-16  6:46                     ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 19:50 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

Hi,

On 4/15/20 8:29 PM, Daniel Vetter wrote:
> On Wed, Apr 15, 2020 at 8:19 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 4/15/20 7:54 PM, Daniel Vetter wrote:
>>> On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 4/15/20 2:01 PM, Daniel Vetter wrote:
>>>>> On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 4/15/20 12:22 PM, Daniel Vetter wrote:
>>>>>>> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
>>>>>>
>>>>>> <snip>
>>>>>>
>>>>>>>>> iv. What every SoC subsystem does:
>>>>>>>>>
>>>>>>>>> - lcdshadow drivers register drivers
>>>>>>>>> - drm drivers look them up
>>>>>>>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
>>>>>>>>> the entire thing is assembled.
>>>>>>>>>
>>>>>>>>> That's what we're doing already for other standardized components like
>>>>>>>>> drm_bridge or drm_panel, and I think that's also the right approach
>>>>>>>>> for backlight and anything else like that. Hand-rolling our own
>>>>>>>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
>>>>>>>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
>>>>>>>>> building a driver from component, which spans subsystems and is also
>>>>>>>>> the underlying magic that makes stuff like component.c work.
>>>>>>>>
>>>>>>>> On the SoCs we have devicetree telling us what components there
>>>>>>>> are, so we can wait for them to show up. The only way to figure out
>>>>>>>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
>>>>>>>> or duplicating a lot of code from thinkpad_acpi. Edit:
>>>>>>>> also see below for a possible solution.
>>>>>>>
>>>>>>> Yup it sucks. I think all we can do is have a small acpi match
>>>>>>> function (which yes will duplicate some of the thinkpad_acpi driver
>>>>>>> logic) to re-create that information and give us a "should we have a
>>>>>>> lcdshadow driver for this $pci_device" answer.
>>>>>>
>>>>>> Ok, so questions about this solution:
>>>>>>
>>>>>> 1. Where should that match-function live ?
>>>>>>
>>>>>> 2. An acpi_thinkpad derived match-function will only be able to
>>>>>>       answer if there is an lcdshadow device/driver for the internal
>>>>>>       panel. It will not be able to tie this info to a certain PCI
>>>>>>       device. My plan is to pass NULL as dev_name when registering
>>>>>>       the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
>>>>>>       skip device-name matching (consider everything a match) for
>>>>>>       lcdshadow-devices registered with NULL as dev_name.
>>>>>>
>>>>>>       So I guess in this case the mini match function should just
>>>>>>       ignore the passed in device?
>>>>>
>>>>> Yeah I think we can't really avoid that. I also expect that we'll need
>>>>> ACPI and dt versions of this, and driver needs to know which one to call.
>>>>> Since at least in a dt world the driver knows exactly for which dt node it
>>>>> needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
>>>>> more strict.
>>>>>
>>>>> For the acpi version I'm not even sure we can do more than provide the
>>>>> struct device * pointer of the gpu. I think if we ever get more than 1
>>>>> lcdshadow driver on acpi systems we can add more stuff later on, for now
>>>>> I'd just leave that out.
>>>>>
>>>>> So maybe
>>>>>
>>>>> acpi_lcdshadow_get(struct device *dev);
>>>>>
>>>>> of_lcdshadow_get(struct device_node *np);
>>>>>
>>>>> And with maybe a future plan to add some kind of enum or whatever to
>>>>> acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
>>>>> an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
>>>>
>>>> Ok, note I plan to only implement the acpi version for now, I do
>>>> expect some non ACPI/x86 devices to show up with his feature
>>>> eventually but I believe it is best to implement this once
>>>> those actually show up. Esp. since this will also involve adding
>>>> some devicetree bindings for this.
>>>
>>> ofc, just wanted to lay out the entire thing. The DT version needs some
>>> good bikeshed on the dt schema first anyway (so that the helper can decode
>>> that directly).
>>>
>>>>> We might also want a low-level lcdshadow_get() which only returns ENOENT
>>>>> when the driver isn't there, and which leaves "do we really need one?" to
>>>>> higher levels to answer.
>>>>
>>>> Right, so my latest idea on that is indeed a high-level lcdshadow_get()
>>>> which takes a struct device * and a connector-name and which never
>>>> returns EPROBE_DEFER.
>>>>
>>>> As for leaving things to the higher levels to answer, as explained
>>>> in my other follow-up email I think that we should probably add a
>>>> lcdshadow_probe_defer() helper for this and call that early on
>>>> in the PCI-driver probe functions for the 3 major x86 GPU drivers.
>>>> Does that sound ok to you?
>>>
>>> Uh ... not pretty. There's still a lifetime problem that strictly speaking
>>> there's nothing stopping the other driver from getting unloaded between
>>> your _probe_defer and the subsequent _get. I think fixing this properly
>>> (and screaming a bit at the error code, oh well) is better.
>>
>> I would really like to separate the discussion and the work
>> on getting the 3 major x86 GPU drivers ready to deal with EPROBE_DEFER
>> from the lcdshadow discussion and work.  I expect getting these
>> 3 drivers ready for EPROBE_DEFER is going to be a major undertaking
>> and I would like avoid introducing this significant scope creep
>> to the lcdshadow discussion, because it simply is a too big undertaking
>> to undertake without us getting a significant amount of manpower
>> specifically for this from somewhere.
>>
>> Note I do agree with you that getting these 3 drivers ready
>> for EPROBE_DEFER handling is a worthwhile undertaking, but
>> it simply will take too much extra time and as such IMHO it
>> really is out of scope for the lcdshadow stuff.
>> I expect the amount of work (esp. also dealing with testing
>> and regressions) for the EPROBE_DEFER project by itself
>> to be a lot *more* work then the actual lcdshadow project.
>>
>> So going with the assumption/decision that adding proper
>> EPROBE_DEFER handling to these 3 drivers is out of scope,
>> I believe that adding a lcdshadow_probe_defer() helper is
>> an ok solution/workaround for now.
>>
>> As for your case of the other driver getting unloaded in between
>> the check and use of it, that can only happen by explicit user
>> action and in that case the worst thing what will happen
>> is the "privacy-screen" property missing from the connector,
>> which in that case is more or less exactly what the user
>> asked for.
> 
> For i915 we've had patches, for mei-hdcp integration. Until it became
> clear that the mei subsystem is bonkers, and handles suspend/resume by
> unloading! drivers.
> 
> Which forced i915 to unload and broke the world :-/
> 
> So at least for i915 the work should be done already, and just amount
> to updating the patches Ram already had. No ideas about the other 2.

Ok.

> What I don't really want is an interface with races. So if fixing the
> other drivers is too much work, what we could do is roughly:
> 
> - in the top-level probe function to an acpi_lcdshadow_get. This might
> fail with EPROBE_DEFER.
> - this gives us a dangling reference, but we can drop that one when we
> exit the top-level probe function (both on success and on error cases)
> - the 2nd acpi_lcdshadow_get call deep down should always succeed at
> that point (since the top level holds a reference), so you could wrap
> that in a WARNING(IS_ERR_PTR()). Ok that's a lie, if we concurrently
> unload then the 2nd call still fails (the reference can never prevent
> a hotunbind in the linux kernel device model), so this is exactly as
> buggy as your version, so still needs a comment about that.
> 
> Adding a acpi_lcdshadow_probe_defer() function otoh just gives people
> the impression that that's actually a correct way of doing this.
> 
> Then it's up to the driver maintainers whether they're ok with the
> above hack of a fake reference, or not. As I said, I think for i915 it
> should be fairly ok to just roll it out, but maybe the patches don't
> apply at all anymore.

Ok trying to taking a ref early on and handling EPROBE_DEFER
at that first attempt to take a ref works for me. So lets go with
that. I will try to whip up a v1 patch-set for this, ETA aprox.
1-2 weeks I guess.

> btw to make everything work you also need to set up a device_link
> between the lcdshadow device (and it's driver, that's the more
> important thing) and the gpu device. That device link will make sure
> that
> - suspend/resume is done in the right order
> - driver load/unload is done in the right order, i.e. unloading of the
> lcdshadow driver will automatically force an unbind of the gpu driver
> first.

That is an interesting idea, but that does assume that their
is an actual struct device which the code handling the lcdshadow
binds to, which in case of thinkpad_acpi is not really the case.

Anyways passing in a parent device when registering a lcdshadow_dev
seems like a good idea and we can do the device_link thing if the parent
is non NULL.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 19:20     ` Rajat Jain
@ 2020-04-15 21:10       ` Jani Nikula
  2020-04-15 21:21         ` Hans de Goede
  2020-04-17  9:05         ` Pekka Paalanen
  0 siblings, 2 replies; 54+ messages in thread
From: Jani Nikula @ 2020-04-15 21:10 UTC (permalink / raw)
  To: Rajat Jain, Hans de Goede
  Cc: , Thomas Zimmermann, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Nitin Joshi1, Mark Pearson,
	Benjamin Berg

On Wed, 15 Apr 2020, Rajat Jain <rajatja@google.com> wrote:
> Hello,
>
> On Wed, Apr 15, 2020 at 8:40 AM Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>> On 4/15/20 5:28 PM, Jani Nikula wrote:
>> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
>> >> ii. Currently the "privacy-screen" property added by Rajat's
>> >> patch-set is an enum with 2 possible values:
>> >> "Enabled"
>> >> "Disabled"
>> >>
>> >> We could add a third value "Not Available", which would be the
>> >> default and then for internal panels always add the property
>> >> so that we avoid the problem that detecting if the laptop has
>> >> an internal privacy screen needs to be done before the connector
>> >> is registered. Then we can add some hooks which allow an
>> >> lcdshadow-driver to register itself against a connector later
>> >> (which is non trivial wrt probe order, but lets ignore that for now).
>> >
>> > I regret dropping the ball on Rajat's series (sorry!).
>> >
>> > I do think having the connector property for this is the way to go.
>>
>> I 100% agree.
>>
>> > Even
>> > if we couldn't necessarily figure out all the details on the kernel
>> > internal connections, can we settle on the property though, so we could
>> > move forward with Rajat's series?
>
> Thanks, it would be great!.
>
>>
>> Yes please, this will also allow us to move forward with userspace
>> support even if for testing that we do some hacks for the kernel's
>> internal connections for now.
>>
>> > Moreover, do we actually need two properties, one which could indicate
>> > userspace's desire for the property, and another that tells the hardware
>> > state?
>>
>> No I do not think so. I would expect there to just be one property,
>> I guess that if the state is (partly) firmware controlled then there
>> might be a race, but we will need a notification mechanism (*) for
>> firmware triggered state changes anyways, so shortly after loosing
>> the race userspace will process the notification and it will know
>> about it.
>
> I agree with Hans here that I think it would be better if we could do
> it with one property.
>
>  * I can imagine demand for laptops that have a "hardware kill switch"
> for privacy screen (just like there are for camera etc today). So I
> think in future we may have to deal with this case anyway. In such
> devices it's the hardware (as opposite to firmware) that will change
> the state. The HW will likely provide an interrupt to the software to
> notify of the change. This is all imaginative at this point though.
>
> * I think having 2 properties might be a confusing UAPI. Also, we have
> existing properties like link-status that can be changed by both the
> user and the hardware.

I think the consensus is that all properties that get changed by both
userspace and the kernel are mistakes, and the way to handle it is to
have two properties.

BR,
Jani.


>
> Thanks,
>
> Rajat
>
>>
>> One thing which might be useful is a way to signal that the property
>> is read-only in case we ever hit hw where that is the case.
>>
>> > I'd so very much like to have no in-kernel/in-firmware shortcuts
>> > to enable/disable the privacy screen, and instead have any hardware
>> > buttons just be events that the userspace could react to. However I
>> > don't think that'll be the case unfortunately.
>>
>> In my experience with keyboard-backlight support, we will (unfortunately)
>> see a mix and in some case we will get a notification that the firmware
>> has adjusted the state, rather then just getting a keypress and
>> dealing with that ourselves.  In some cases we may even be able to
>> choose, so the fw will deal with it by default but we can ask it
>> to just send a key-press.  But I do believe that we can *not* expect
>> that we will always just get a keypress for userspace to deal with.
>>
>> Regards,
>>
>> Hans
>>
>>
>> *) Some udev event I guess, I sorta assume there already is a
>> notification mechanism for property change notifications ?
>>
>>

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 21:10       ` Jani Nikula
@ 2020-04-15 21:21         ` Hans de Goede
  2020-04-15 21:51           ` [External] " Mark Pearson
  2020-04-17  9:05         ` Pekka Paalanen
  1 sibling, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-15 21:21 UTC (permalink / raw)
  To: Jani Nikula, Rajat Jain
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Nitin Joshi1, Mark Pearson,
	Benjamin Berg

Hi,

On 4/15/20 11:10 PM, Jani Nikula wrote:
> On Wed, 15 Apr 2020, Rajat Jain <rajatja@google.com> wrote:
>> Hello,
>>
>> On Wed, Apr 15, 2020 at 8:40 AM Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> On 4/15/20 5:28 PM, Jani Nikula wrote:
>>>> On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>> ii. Currently the "privacy-screen" property added by Rajat's
>>>>> patch-set is an enum with 2 possible values:
>>>>> "Enabled"
>>>>> "Disabled"
>>>>>
>>>>> We could add a third value "Not Available", which would be the
>>>>> default and then for internal panels always add the property
>>>>> so that we avoid the problem that detecting if the laptop has
>>>>> an internal privacy screen needs to be done before the connector
>>>>> is registered. Then we can add some hooks which allow an
>>>>> lcdshadow-driver to register itself against a connector later
>>>>> (which is non trivial wrt probe order, but lets ignore that for now).
>>>>
>>>> I regret dropping the ball on Rajat's series (sorry!).
>>>>
>>>> I do think having the connector property for this is the way to go.
>>>
>>> I 100% agree.
>>>
>>>> Even
>>>> if we couldn't necessarily figure out all the details on the kernel
>>>> internal connections, can we settle on the property though, so we could
>>>> move forward with Rajat's series?
>>
>> Thanks, it would be great!.
>>
>>>
>>> Yes please, this will also allow us to move forward with userspace
>>> support even if for testing that we do some hacks for the kernel's
>>> internal connections for now.
>>>
>>>> Moreover, do we actually need two properties, one which could indicate
>>>> userspace's desire for the property, and another that tells the hardware
>>>> state?
>>>
>>> No I do not think so. I would expect there to just be one property,
>>> I guess that if the state is (partly) firmware controlled then there
>>> might be a race, but we will need a notification mechanism (*) for
>>> firmware triggered state changes anyways, so shortly after loosing
>>> the race userspace will process the notification and it will know
>>> about it.
>>
>> I agree with Hans here that I think it would be better if we could do
>> it with one property.
>>
>>   * I can imagine demand for laptops that have a "hardware kill switch"
>> for privacy screen (just like there are for camera etc today). So I
>> think in future we may have to deal with this case anyway. In such
>> devices it's the hardware (as opposite to firmware) that will change
>> the state. The HW will likely provide an interrupt to the software to
>> notify of the change. This is all imaginative at this point though.
>>
>> * I think having 2 properties might be a confusing UAPI. Also, we have
>> existing properties like link-status that can be changed by both the
>> user and the hardware.
> 
> I think the consensus is that all properties that get changed by both
> userspace and the kernel are mistakes, and the way to handle it is to
> have two properties.

But the actual privacy screen has only 1 state, having two properties
for this will only be confusing. As I mentioned before we have a similar
case with e.g. keyboard backlighting and there the userspace API
also has a single sysfs attribute for the brightness, with change
notifications to userspace if the firmware changes the brightness
on its own and this works well.

What would the semantics of these 2 different properties be? And
what sort of extra functionality would these semantics offer which
the single property versions does not offer?

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [External] Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 21:21         ` Hans de Goede
@ 2020-04-15 21:51           ` Mark Pearson
  0 siblings, 0 replies; 54+ messages in thread
From: Mark Pearson @ 2020-04-15 21:51 UTC (permalink / raw)
  To: Hans de Goede, Jani Nikula, Rajat Jain
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Nitin Joshi1, Benjamin Berg

Hi,

> -----Original Message-----
> From: Hans de Goede <hdegoede@redhat.com>
> Sent: Wednesday, April 15, 2020 5:21 PM
> 
> Hi,
> 
> On 4/15/20 11:10 PM, Jani Nikula wrote:
> > On Wed, 15 Apr 2020, Rajat Jain <rajatja@google.com> wrote:
> >> Hello,
> >>
> >> On Wed, Apr 15, 2020 at 8:40 AM Hans de Goede
> <hdegoede@redhat.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 4/15/20 5:28 PM, Jani Nikula wrote:
> >>>> On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com>
> wrote:
> >>>>> ii. Currently the "privacy-screen" property added by Rajat's
> >>>>> patch-set is an enum with 2 possible values:
> >>>>> "Enabled"
> >>>>> "Disabled"
> >>>>>
> >>>>> We could add a third value "Not Available", which would be the
> >>>>> default and then for internal panels always add the property
> >>>>> so that we avoid the problem that detecting if the laptop has
> >>>>> an internal privacy screen needs to be done before the connector
> >>>>> is registered. Then we can add some hooks which allow an
> >>>>> lcdshadow-driver to register itself against a connector later
> >>>>> (which is non trivial wrt probe order, but lets ignore that for now).
> >>>>
> >>>> I regret dropping the ball on Rajat's series (sorry!).
> >>>>
> >>>> I do think having the connector property for this is the way to go.
> >>>
> >>> I 100% agree.
> >>>
> >>>> Even
> >>>> if we couldn't necessarily figure out all the details on the kernel
> >>>> internal connections, can we settle on the property though, so we could
> >>>> move forward with Rajat's series?
> >>
> >> Thanks, it would be great!.
> >>
> >>>
> >>> Yes please, this will also allow us to move forward with userspace
> >>> support even if for testing that we do some hacks for the kernel's
> >>> internal connections for now.
> >>>
> >>>> Moreover, do we actually need two properties, one which could
> indicate
> >>>> userspace's desire for the property, and another that tells the hardware
> >>>> state?
> >>>
> >>> No I do not think so. I would expect there to just be one property,
> >>> I guess that if the state is (partly) firmware controlled then there
> >>> might be a race, but we will need a notification mechanism (*) for
> >>> firmware triggered state changes anyways, so shortly after loosing
> >>> the race userspace will process the notification and it will know
> >>> about it.
> >>
> >> I agree with Hans here that I think it would be better if we could do
> >> it with one property.
> >>
> >>   * I can imagine demand for laptops that have a "hardware kill switch"
> >> for privacy screen (just like there are for camera etc today). So I
> >> think in future we may have to deal with this case anyway. In such
> >> devices it's the hardware (as opposite to firmware) that will change
> >> the state. The HW will likely provide an interrupt to the software to
> >> notify of the change. This is all imaginative at this point though.
> >>
> >> * I think having 2 properties might be a confusing UAPI. Also, we have
> >> existing properties like link-status that can be changed by both the
> >> user and the hardware.
> >
> > I think the consensus is that all properties that get changed by both
> > userspace and the kernel are mistakes, and the way to handle it is to
> > have two properties.
> 
> But the actual privacy screen has only 1 state, having two properties
> for this will only be confusing. As I mentioned before we have a similar
> case with e.g. keyboard backlighting and there the userspace API
> also has a single sysfs attribute for the brightness, with change
> notifications to userspace if the firmware changes the brightness
> on its own and this works well.
> 
> What would the semantics of these 2 different properties be? And
> what sort of extra functionality would these semantics offer which
> the single property versions does not offer?
> 
I'm worried I'm missing the point but logically this doesn't make sense to me 
either. We're just providing two mechanisms for controlling ePrivacy - neither 
is wrong. I'm not sure where the consensus was reached and why - I'm afraid I 
don't have that history. 
Not sure it's the best example but you can control the system volume using 
buttons or the SW slider. It's still just the volume. Isn't this similar (ish). Both 
have their place and benefits.

What if users want the hotkey functionality? It is after all quite helpful: 
doing Fn+d is a lot quicker than finding the control box and toggling something 
if someone is looking over your shoulder and you don't want them to see. I'm 
not sure why a user would want it disabled - as long as it is tied in properly with 
the user space reporting.

<Lenovo hat on>
On that note - (I started replying in an earlier thread but got side tracked) there is 
little chance the BIOS team will make significant changes to existing BIOS
for adding extra disable functionality. There are just too many repercussions 
on testing as it impacts Windows. I know it sucks but it's just the way it is and I 
don't think there is a strong enough story to really push for it. Let me know if I'm 
out to lunch or being unreasonable 😊I'm happy to have that conversation with them
but I already know the results...maybe for future platforms but there are issues there
too (when shipping Linux systems it's just another setting to tweak and we're trying to
reduce them not increase them...but that's another story)

Mark

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 19:50                   ` Hans de Goede
@ 2020-04-16  6:46                     ` Daniel Vetter
  0 siblings, 0 replies; 54+ messages in thread
From: Daniel Vetter @ 2020-04-16  6:46 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Wed, Apr 15, 2020 at 9:50 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi,
>
> On 4/15/20 8:29 PM, Daniel Vetter wrote:
> > On Wed, Apr 15, 2020 at 8:19 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >> On 4/15/20 7:54 PM, Daniel Vetter wrote:
> >>> On Wed, Apr 15, 2020 at 03:02:53PM +0200, Hans de Goede wrote:
> >>>> Hi,
> >>>>
> >>>> On 4/15/20 2:01 PM, Daniel Vetter wrote:
> >>>>> On Wed, Apr 15, 2020 at 01:39:23PM +0200, Hans de Goede wrote:
> >>>>>> Hi,
> >>>>>>
> >>>>>> On 4/15/20 12:22 PM, Daniel Vetter wrote:
> >>>>>>> On Wed, Apr 15, 2020 at 12:11 PM Hans de Goede <hdegoede@redhat.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On 4/15/20 11:52 AM, Daniel Vetter wrote:
> >>>>>>
> >>>>>> <snip>
> >>>>>>
> >>>>>>>>> iv. What every SoC subsystem does:
> >>>>>>>>>
> >>>>>>>>> - lcdshadow drivers register drivers
> >>>>>>>>> - drm drivers look them up
> >>>>>>>>> - if stuff isn't there yet, we delay loading with EPROBE_DEFER until
> >>>>>>>>> the entire thing is assembled.
> >>>>>>>>>
> >>>>>>>>> That's what we're doing already for other standardized components like
> >>>>>>>>> drm_bridge or drm_panel, and I think that's also the right approach
> >>>>>>>>> for backlight and anything else like that. Hand-rolling our own
> >>>>>>>>> EPROBE_DEFER handling, or some other duct-tape monsters imo just leads
> >>>>>>>>> to real pain. Also, with EPROBE_DEFER we have one standard way of
> >>>>>>>>> building a driver from component, which spans subsystems and is also
> >>>>>>>>> the underlying magic that makes stuff like component.c work.
> >>>>>>>>
> >>>>>>>> On the SoCs we have devicetree telling us what components there
> >>>>>>>> are, so we can wait for them to show up. The only way to figure out
> >>>>>>>> if the lcdshadow thing is there on a ThinkPad is asking thinkpad_acpi,
> >>>>>>>> or duplicating a lot of code from thinkpad_acpi. Edit:
> >>>>>>>> also see below for a possible solution.
> >>>>>>>
> >>>>>>> Yup it sucks. I think all we can do is have a small acpi match
> >>>>>>> function (which yes will duplicate some of the thinkpad_acpi driver
> >>>>>>> logic) to re-create that information and give us a "should we have a
> >>>>>>> lcdshadow driver for this $pci_device" answer.
> >>>>>>
> >>>>>> Ok, so questions about this solution:
> >>>>>>
> >>>>>> 1. Where should that match-function live ?
> >>>>>>
> >>>>>> 2. An acpi_thinkpad derived match-function will only be able to
> >>>>>>       answer if there is an lcdshadow device/driver for the internal
> >>>>>>       panel. It will not be able to tie this info to a certain PCI
> >>>>>>       device. My plan is to pass NULL as dev_name when registering
> >>>>>>       the lcdshadow-device and have lcdshadow_get(dev, <connector-name>)
> >>>>>>       skip device-name matching (consider everything a match) for
> >>>>>>       lcdshadow-devices registered with NULL as dev_name.
> >>>>>>
> >>>>>>       So I guess in this case the mini match function should just
> >>>>>>       ignore the passed in device?
> >>>>>
> >>>>> Yeah I think we can't really avoid that. I also expect that we'll need
> >>>>> ACPI and dt versions of this, and driver needs to know which one to call.
> >>>>> Since at least in a dt world the driver knows exactly for which dt node it
> >>>>> needs a lcdshadow driver for (with the phandle stuff), so we can be a lot
> >>>>> more strict.
> >>>>>
> >>>>> For the acpi version I'm not even sure we can do more than provide the
> >>>>> struct device * pointer of the gpu. I think if we ever get more than 1
> >>>>> lcdshadow driver on acpi systems we can add more stuff later on, for now
> >>>>> I'd just leave that out.
> >>>>>
> >>>>> So maybe
> >>>>>
> >>>>> acpi_lcdshadow_get(struct device *dev);
> >>>>>
> >>>>> of_lcdshadow_get(struct device_node *np);
> >>>>>
> >>>>> And with maybe a future plan to add some kind of enum or whatever to
> >>>>> acpi_lcdshadow_get(). Both would return either the lcdshadow pointer, or
> >>>>> an PTR_ERR() so that we could encode EPROBE_DEFER vs ENOENT.
> >>>>
> >>>> Ok, note I plan to only implement the acpi version for now, I do
> >>>> expect some non ACPI/x86 devices to show up with his feature
> >>>> eventually but I believe it is best to implement this once
> >>>> those actually show up. Esp. since this will also involve adding
> >>>> some devicetree bindings for this.
> >>>
> >>> ofc, just wanted to lay out the entire thing. The DT version needs some
> >>> good bikeshed on the dt schema first anyway (so that the helper can decode
> >>> that directly).
> >>>
> >>>>> We might also want a low-level lcdshadow_get() which only returns ENOENT
> >>>>> when the driver isn't there, and which leaves "do we really need one?" to
> >>>>> higher levels to answer.
> >>>>
> >>>> Right, so my latest idea on that is indeed a high-level lcdshadow_get()
> >>>> which takes a struct device * and a connector-name and which never
> >>>> returns EPROBE_DEFER.
> >>>>
> >>>> As for leaving things to the higher levels to answer, as explained
> >>>> in my other follow-up email I think that we should probably add a
> >>>> lcdshadow_probe_defer() helper for this and call that early on
> >>>> in the PCI-driver probe functions for the 3 major x86 GPU drivers.
> >>>> Does that sound ok to you?
> >>>
> >>> Uh ... not pretty. There's still a lifetime problem that strictly speaking
> >>> there's nothing stopping the other driver from getting unloaded between
> >>> your _probe_defer and the subsequent _get. I think fixing this properly
> >>> (and screaming a bit at the error code, oh well) is better.
> >>
> >> I would really like to separate the discussion and the work
> >> on getting the 3 major x86 GPU drivers ready to deal with EPROBE_DEFER
> >> from the lcdshadow discussion and work.  I expect getting these
> >> 3 drivers ready for EPROBE_DEFER is going to be a major undertaking
> >> and I would like avoid introducing this significant scope creep
> >> to the lcdshadow discussion, because it simply is a too big undertaking
> >> to undertake without us getting a significant amount of manpower
> >> specifically for this from somewhere.
> >>
> >> Note I do agree with you that getting these 3 drivers ready
> >> for EPROBE_DEFER handling is a worthwhile undertaking, but
> >> it simply will take too much extra time and as such IMHO it
> >> really is out of scope for the lcdshadow stuff.
> >> I expect the amount of work (esp. also dealing with testing
> >> and regressions) for the EPROBE_DEFER project by itself
> >> to be a lot *more* work then the actual lcdshadow project.
> >>
> >> So going with the assumption/decision that adding proper
> >> EPROBE_DEFER handling to these 3 drivers is out of scope,
> >> I believe that adding a lcdshadow_probe_defer() helper is
> >> an ok solution/workaround for now.
> >>
> >> As for your case of the other driver getting unloaded in between
> >> the check and use of it, that can only happen by explicit user
> >> action and in that case the worst thing what will happen
> >> is the "privacy-screen" property missing from the connector,
> >> which in that case is more or less exactly what the user
> >> asked for.
> >
> > For i915 we've had patches, for mei-hdcp integration. Until it became
> > clear that the mei subsystem is bonkers, and handles suspend/resume by
> > unloading! drivers.
> >
> > Which forced i915 to unload and broke the world :-/
> >
> > So at least for i915 the work should be done already, and just amount
> > to updating the patches Ram already had. No ideas about the other 2.
>
> Ok.
>
> > What I don't really want is an interface with races. So if fixing the
> > other drivers is too much work, what we could do is roughly:
> >
> > - in the top-level probe function to an acpi_lcdshadow_get. This might
> > fail with EPROBE_DEFER.
> > - this gives us a dangling reference, but we can drop that one when we
> > exit the top-level probe function (both on success and on error cases)
> > - the 2nd acpi_lcdshadow_get call deep down should always succeed at
> > that point (since the top level holds a reference), so you could wrap
> > that in a WARNING(IS_ERR_PTR()). Ok that's a lie, if we concurrently
> > unload then the 2nd call still fails (the reference can never prevent
> > a hotunbind in the linux kernel device model), so this is exactly as
> > buggy as your version, so still needs a comment about that.
> >
> > Adding a acpi_lcdshadow_probe_defer() function otoh just gives people
> > the impression that that's actually a correct way of doing this.
> >
> > Then it's up to the driver maintainers whether they're ok with the
> > above hack of a fake reference, or not. As I said, I think for i915 it
> > should be fairly ok to just roll it out, but maybe the patches don't
> > apply at all anymore.
>
> Ok trying to taking a ref early on and handling EPROBE_DEFER
> at that first attempt to take a ref works for me. So lets go with
> that. I will try to whip up a v1 patch-set for this, ETA aprox.
> 1-2 weeks I guess.
>
> > btw to make everything work you also need to set up a device_link
> > between the lcdshadow device (and it's driver, that's the more
> > important thing) and the gpu device. That device link will make sure
> > that
> > - suspend/resume is done in the right order
> > - driver load/unload is done in the right order, i.e. unloading of the
> > lcdshadow driver will automatically force an unbind of the gpu driver
> > first.
>
> That is an interesting idea, but that does assume that their
> is an actual struct device which the code handling the lcdshadow
> binds to, which in case of thinkpad_acpi is not really the case.
>
> Anyways passing in a parent device when registering a lcdshadow_dev
> seems like a good idea and we can do the device_link thing if the parent
> is non NULL.

For device_link to work, it needs to be the struct device the actual
driver is bound to. Otherwise the suspend/resume functions dont do
anything. No idea how this acpi stuff works, if it doesn't work like a
driver then we're a bit screwed perhaps ...

Anyway I guess for lcdshadow it's not that important, since not much
to suspend/resume. For backlight this will matter more I guess.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 15:40   ` Hans de Goede
  2020-04-15 17:14     ` [External] " Mark Pearson
  2020-04-15 19:20     ` Rajat Jain
@ 2020-04-17  9:02     ` Pekka Paalanen
  2020-04-17 11:55       ` Jani Nikula
  2020-04-17 14:17       ` Daniel Vetter
  2 siblings, 2 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-17  9:02 UTC (permalink / raw)
  To: Hans de Goede, Mark Pearson
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Rajat Jain,
	Nitin Joshi1, Benjamin Berg


[-- Attachment #1.1: Type: text/plain, Size: 4851 bytes --]

On Wed, 15 Apr 2020 17:40:46 +0200
Hans de Goede <hdegoede@redhat.com> wrote:

> Hi,
> 
> On 4/15/20 5:28 PM, Jani Nikula wrote:
> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:  
> >> ii. Currently the "privacy-screen" property added by Rajat's
> >> patch-set is an enum with 2 possible values:
> >> "Enabled"
> >> "Disabled"
> >>
> >> We could add a third value "Not Available", which would be the
> >> default and then for internal panels always add the property
> >> so that we avoid the problem that detecting if the laptop has
> >> an internal privacy screen needs to be done before the connector
> >> is registered. Then we can add some hooks which allow an
> >> lcdshadow-driver to register itself against a connector later
> >> (which is non trivial wrt probe order, but lets ignore that for now).  
> > 
> > I regret dropping the ball on Rajat's series (sorry!).
> > 
> > I do think having the connector property for this is the way to go.  
> 
> I 100% agree.
> 
> > Even
> > if we couldn't necessarily figure out all the details on the kernel
> > internal connections, can we settle on the property though, so we could
> > move forward with Rajat's series?  
> 
> Yes please, this will also allow us to move forward with userspace
> support even if for testing that we do some hacks for the kernel's
> internal connections for now.
> 
> > Moreover, do we actually need two properties, one which could indicate
> > userspace's desire for the property, and another that tells the hardware
> > state?  
> 
> No I do not think so. I would expect there to just be one property,
> I guess that if the state is (partly) firmware controlled then there
> might be a race, but we will need a notification mechanism (*) for
> firmware triggered state changes anyways, so shortly after loosing
> the race userspace will process the notification and it will know
> about it.
> 
> One thing which might be useful is a way to signal that the property
> is read-only in case we ever hit hw where that is the case.
> 
> > I'd so very much like to have no in-kernel/in-firmware shortcuts
> > to enable/disable the privacy screen, and instead have any hardware
> > buttons just be events that the userspace could react to. However I
> > don't think that'll be the case unfortunately.  
> 
> In my experience with keyboard-backlight support, we will (unfortunately)
> see a mix and in some case we will get a notification that the firmware
> has adjusted the state, rather then just getting a keypress and
> dealing with that ourselves.  In some cases we may even be able to
> choose, so the fw will deal with it by default but we can ask it
> to just send a key-press.  But I do believe that we can *not* expect
> that we will always just get a keypress for userspace to deal with.

Hi,

let's think about how userspace uses atomic KMS UAPI. The simplest way
to use atomic correctly is that userspace will for every update send the
full, complete set of all properties that exist, both known and unknown
to userspace (to recover from temporarily VT-switching to another KMS
program that changes unknown properties). Attempting to track which
properties already have their correct values in the kernel is extra
work for just extra bugs.

Assuming the property is userspace-writable: if kernel goes and
changes the property value on its own, it will very likely be just
overwritten by userspace right after if userspace does not manage to
process the uevent first. If that happens and userspace later
processes the uevent, userspace queries the kernel for the current
proprerty state which is now what userspace wrote, not what firmware
set.

Therefore you end up with the firmware hotkey working only randomly.

It would be much better to have the hotkey events delivered to
userspace so that userspace can control the privacy screen and
everything will be reliable, both the hotkeys and any GUI for it. The
other reliable option is that userspace must never be able to change
privacy screen state, only the hardware hotkeys can.

> 
> Regards,
> 
> Hans
> 
> 
> *) Some udev event I guess, I sorta assume there already is a
> notification mechanism for property change notifications ?

Yes, such mechanism has been discussed before, see the thread containing
https://lists.freedesktop.org/archives/dri-devel/2019-May/217588.html

TL;DR: the mechanism exists and is called the hotplug event. But the
problem with the hotplug event is that it carries no information about
what changed, so userspace is forced re-discover everything about the
DRM device. That thread discusses extending the hotplug event to be
able to signal changes in individual properties rather than "something
maybe changed, you figure out what".


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-15 21:10       ` Jani Nikula
  2020-04-15 21:21         ` Hans de Goede
@ 2020-04-17  9:05         ` Pekka Paalanen
  1 sibling, 0 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-17  9:05 UTC (permalink / raw)
  To: Rajat Jain, Hans de Goede, Mark Pearson
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Thomas Zimmermann,
	Nitin Joshi1


[-- Attachment #1.1: Type: text/plain, Size: 1222 bytes --]

On Thu, 16 Apr 2020 00:10:06 +0300
Jani Nikula <jani.nikula@linux.intel.com> wrote:

> On Wed, 15 Apr 2020, Rajat Jain <rajatja@google.com> wrote:

> > * I think having 2 properties might be a confusing UAPI. Also, we have
> > existing properties like link-status that can be changed by both the
> > user and the hardware.  
> 
> I think the consensus is that all properties that get changed by both
> userspace and the kernel are mistakes, and the way to handle it is to
> have two properties.

Yes, I very much agree with Jani.

I don't like KMS properties that are writable by both the driver and
userspace. It's awkward to handle in userspace and fundamentally racy
aside from tricks like "what you wrote is not what you read back". I
have ranted against that when looking at HDCP properties, e.g.:
https://lists.freedesktop.org/archives/dri-devel/2019-July/226424.html

See also my other email in this thread about how userspace uses atomic
KMS UAPI: you must expect that userspace will always write the
property for any KMS update, even if it does not change its value, so
any change done by the kernel is randomly lost unless the property is
read-only or otherwise weird.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-17  9:02     ` Pekka Paalanen
@ 2020-04-17 11:55       ` Jani Nikula
  2020-04-17 14:18         ` Daniel Vetter
  2020-04-21 12:37         ` Hans de Goede
  2020-04-17 14:17       ` Daniel Vetter
  1 sibling, 2 replies; 54+ messages in thread
From: Jani Nikula @ 2020-04-17 11:55 UTC (permalink / raw)
  To: Pekka Paalanen, Hans de Goede, Mark Pearson
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Rajat Jain, Nitin Joshi1,
	Benjamin Berg

On Fri, 17 Apr 2020, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> On Wed, 15 Apr 2020 17:40:46 +0200
> Hans de Goede <hdegoede@redhat.com> wrote:
>
>> Hi,
>> 
>> On 4/15/20 5:28 PM, Jani Nikula wrote:
>> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:  
>> >> ii. Currently the "privacy-screen" property added by Rajat's
>> >> patch-set is an enum with 2 possible values:
>> >> "Enabled"
>> >> "Disabled"
>> >>
>> >> We could add a third value "Not Available", which would be the
>> >> default and then for internal panels always add the property
>> >> so that we avoid the problem that detecting if the laptop has
>> >> an internal privacy screen needs to be done before the connector
>> >> is registered. Then we can add some hooks which allow an
>> >> lcdshadow-driver to register itself against a connector later
>> >> (which is non trivial wrt probe order, but lets ignore that for now).  
>> > 
>> > I regret dropping the ball on Rajat's series (sorry!).
>> > 
>> > I do think having the connector property for this is the way to go.  
>> 
>> I 100% agree.
>> 
>> > Even
>> > if we couldn't necessarily figure out all the details on the kernel
>> > internal connections, can we settle on the property though, so we could
>> > move forward with Rajat's series?  
>> 
>> Yes please, this will also allow us to move forward with userspace
>> support even if for testing that we do some hacks for the kernel's
>> internal connections for now.
>> 
>> > Moreover, do we actually need two properties, one which could indicate
>> > userspace's desire for the property, and another that tells the hardware
>> > state?  
>> 
>> No I do not think so. I would expect there to just be one property,
>> I guess that if the state is (partly) firmware controlled then there
>> might be a race, but we will need a notification mechanism (*) for
>> firmware triggered state changes anyways, so shortly after loosing
>> the race userspace will process the notification and it will know
>> about it.
>> 
>> One thing which might be useful is a way to signal that the property
>> is read-only in case we ever hit hw where that is the case.
>> 
>> > I'd so very much like to have no in-kernel/in-firmware shortcuts
>> > to enable/disable the privacy screen, and instead have any hardware
>> > buttons just be events that the userspace could react to. However I
>> > don't think that'll be the case unfortunately.  
>> 
>> In my experience with keyboard-backlight support, we will (unfortunately)
>> see a mix and in some case we will get a notification that the firmware
>> has adjusted the state, rather then just getting a keypress and
>> dealing with that ourselves.  In some cases we may even be able to
>> choose, so the fw will deal with it by default but we can ask it
>> to just send a key-press.  But I do believe that we can *not* expect
>> that we will always just get a keypress for userspace to deal with.
>
> Hi,
>
> let's think about how userspace uses atomic KMS UAPI. The simplest way
> to use atomic correctly is that userspace will for every update send the
> full, complete set of all properties that exist, both known and unknown
> to userspace (to recover from temporarily VT-switching to another KMS
> program that changes unknown properties). Attempting to track which
> properties already have their correct values in the kernel is extra
> work for just extra bugs.
>
> Assuming the property is userspace-writable: if kernel goes and
> changes the property value on its own, it will very likely be just
> overwritten by userspace right after if userspace does not manage to
> process the uevent first. If that happens and userspace later
> processes the uevent, userspace queries the kernel for the current
> proprerty state which is now what userspace wrote, not what firmware
> set.
>
> Therefore you end up with the firmware hotkey working only randomly.
>
> It would be much better to have the hotkey events delivered to
> userspace so that userspace can control the privacy screen and
> everything will be reliable, both the hotkeys and any GUI for it.

I'd like this too. However I fear this is out of our control, and OEMs
have and will anyway fiddle with the privacy screen directly no matter
what we say, and we can't prevent that. From their POV it's easier for
them to do their value-add in components they have total control over. I
emphatize with that view, even if it's counter-productive from the Linux
ecosystem POV.

So we'll just have to deal with it.

> The other reliable option is that userspace must never be able to
> change privacy screen state, only the hardware hotkeys can.

That, in turn, discourages anyone from doing the right thing, and blocks
us from adding any nice additional features for privacy screens that
only the userspace is capable of managing. For example, controlling
privacy screen based on content, which seems like an obvious feature.

I suppose rf-kill is a bit similar.

You'd have a userspace controlled property to state what the userspace
wants, and a kernel controlled property to show the hardware
state. Userspace can ask for one or the other, and usually this happens,
but the hardware hotkey bypasses the whole thing.

It's not perfect. But I think just one property changed nilly-willy by
both the kernel and userspace (especially when it's really not in the
kernel's full control either) is going to be a PITA.

BR,
Jani.



>
>> 
>> Regards,
>> 
>> Hans
>> 
>> 
>> *) Some udev event I guess, I sorta assume there already is a
>> notification mechanism for property change notifications ?
>
> Yes, such mechanism has been discussed before, see the thread containing
> https://lists.freedesktop.org/archives/dri-devel/2019-May/217588.html
>
> TL;DR: the mechanism exists and is called the hotplug event. But the
> problem with the hotplug event is that it carries no information about
> what changed, so userspace is forced re-discover everything about the
> DRM device. That thread discusses extending the hotplug event to be
> able to signal changes in individual properties rather than "something
> maybe changed, you figure out what".
>
>
> Thanks,
> pq

-- 
Jani Nikula, Intel Open Source Graphics Center
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-17  9:02     ` Pekka Paalanen
  2020-04-17 11:55       ` Jani Nikula
@ 2020-04-17 14:17       ` Daniel Vetter
  2020-04-20  8:27         ` Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support) Pekka Paalanen
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-17 14:17 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Wed, 15 Apr 2020 17:40:46 +0200
> Hans de Goede <hdegoede@redhat.com> wrote:
>
> > Hi,
> >
> > On 4/15/20 5:28 PM, Jani Nikula wrote:
> > > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> > >> ii. Currently the "privacy-screen" property added by Rajat's
> > >> patch-set is an enum with 2 possible values:
> > >> "Enabled"
> > >> "Disabled"
> > >>
> > >> We could add a third value "Not Available", which would be the
> > >> default and then for internal panels always add the property
> > >> so that we avoid the problem that detecting if the laptop has
> > >> an internal privacy screen needs to be done before the connector
> > >> is registered. Then we can add some hooks which allow an
> > >> lcdshadow-driver to register itself against a connector later
> > >> (which is non trivial wrt probe order, but lets ignore that for now).
> > >
> > > I regret dropping the ball on Rajat's series (sorry!).
> > >
> > > I do think having the connector property for this is the way to go.
> >
> > I 100% agree.
> >
> > > Even
> > > if we couldn't necessarily figure out all the details on the kernel
> > > internal connections, can we settle on the property though, so we could
> > > move forward with Rajat's series?
> >
> > Yes please, this will also allow us to move forward with userspace
> > support even if for testing that we do some hacks for the kernel's
> > internal connections for now.
> >
> > > Moreover, do we actually need two properties, one which could indicate
> > > userspace's desire for the property, and another that tells the hardware
> > > state?
> >
> > No I do not think so. I would expect there to just be one property,
> > I guess that if the state is (partly) firmware controlled then there
> > might be a race, but we will need a notification mechanism (*) for
> > firmware triggered state changes anyways, so shortly after loosing
> > the race userspace will process the notification and it will know
> > about it.
> >
> > One thing which might be useful is a way to signal that the property
> > is read-only in case we ever hit hw where that is the case.
> >
> > > I'd so very much like to have no in-kernel/in-firmware shortcuts
> > > to enable/disable the privacy screen, and instead have any hardware
> > > buttons just be events that the userspace could react to. However I
> > > don't think that'll be the case unfortunately.
> >
> > In my experience with keyboard-backlight support, we will (unfortunately)
> > see a mix and in some case we will get a notification that the firmware
> > has adjusted the state, rather then just getting a keypress and
> > dealing with that ourselves.  In some cases we may even be able to
> > choose, so the fw will deal with it by default but we can ask it
> > to just send a key-press.  But I do believe that we can *not* expect
> > that we will always just get a keypress for userspace to deal with.
>
> Hi,
>
> let's think about how userspace uses atomic KMS UAPI. The simplest way
> to use atomic correctly is that userspace will for every update send the
> full, complete set of all properties that exist, both known and unknown
> to userspace (to recover from temporarily VT-switching to another KMS
> program that changes unknown properties). Attempting to track which
> properties already have their correct values in the kernel is extra
> work for just extra bugs.

Uh if you do that you'll get random surprising failures if you don't
also set ALLOW_MODESET, because that way you'll automatically repair
link failures and stuff like that. I'm assuming your userspace only
supplies all the properties for crtc and planes, and leaves connectors
as-is? Otherwise you already have some fun bugs.

In general I'd say userspace shouldn't write stuff it doesn't
understand. If you limit yourself to just the properties you do want
to (re)set, that's safe. But if you just blindly write everything all
the time, random modesets, and hence random failures if you don't set
ALLOW_MODESET.

> Assuming the property is userspace-writable: if kernel goes and
> changes the property value on its own, it will very likely be just
> overwritten by userspace right after if userspace does not manage to
> process the uevent first. If that happens and userspace later
> processes the uevent, userspace queries the kernel for the current
> proprerty state which is now what userspace wrote, not what firmware
> set.
>
> Therefore you end up with the firmware hotkey working only randomly.
>
> It would be much better to have the hotkey events delivered to
> userspace so that userspace can control the privacy screen and
> everything will be reliable, both the hotkeys and any GUI for it. The
> other reliable option is that userspace must never be able to change
> privacy screen state, only the hardware hotkeys can.

We have fancy new uevents which give you both the connector and the
property, so you know what's going on.

Also, a property which userspace and the kernel can race like you
describe above is broken. We don't have these, and we won't merge
them.

The ones we do have the state transitions are a lot clearer, and
userspace overwriting what the kernel has done is not actually going
to cause a big pain. At least in the sense of the transition will be
lost, since for e.g. both link_status and hdcp the value the kernel
sets is not a value userspace can set. But it can result in problems
if you just blindly write them again causing modesets you'd not
expect.
-Daniel


> >
> > Regards,
> >
> > Hans
> >
> >
> > *) Some udev event I guess, I sorta assume there already is a
> > notification mechanism for property change notifications ?
>
> Yes, such mechanism has been discussed before, see the thread containing
> https://lists.freedesktop.org/archives/dri-devel/2019-May/217588.html
>
> TL;DR: the mechanism exists and is called the hotplug event. But the
> problem with the hotplug event is that it carries no information about
> what changed, so userspace is forced re-discover everything about the
> DRM device. That thread discusses extending the hotplug event to be
> able to signal changes in individual properties rather than "something
> maybe changed, you figure out what".
>
>
> Thanks,
> pq



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-17 11:55       ` Jani Nikula
@ 2020-04-17 14:18         ` Daniel Vetter
  2020-04-17 14:54           ` Benjamin Berg
  2020-04-21 12:37         ` Hans de Goede
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-17 14:18 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Benjamin Berg, David Airlie, Christian Kellner,
	Javier Martinez Canillas, Hans de Goede, dri-devel,
	Thomas Zimmermann, Nitin Joshi1, Rajat Jain, Mark Pearson

On Fri, Apr 17, 2020 at 1:55 PM Jani Nikula <jani.nikula@linux.intel.com> wrote:
>
> On Fri, 17 Apr 2020, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > On Wed, 15 Apr 2020 17:40:46 +0200
> > Hans de Goede <hdegoede@redhat.com> wrote:
> >
> >> Hi,
> >>
> >> On 4/15/20 5:28 PM, Jani Nikula wrote:
> >> > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> >> >> ii. Currently the "privacy-screen" property added by Rajat's
> >> >> patch-set is an enum with 2 possible values:
> >> >> "Enabled"
> >> >> "Disabled"
> >> >>
> >> >> We could add a third value "Not Available", which would be the
> >> >> default and then for internal panels always add the property
> >> >> so that we avoid the problem that detecting if the laptop has
> >> >> an internal privacy screen needs to be done before the connector
> >> >> is registered. Then we can add some hooks which allow an
> >> >> lcdshadow-driver to register itself against a connector later
> >> >> (which is non trivial wrt probe order, but lets ignore that for now).
> >> >
> >> > I regret dropping the ball on Rajat's series (sorry!).
> >> >
> >> > I do think having the connector property for this is the way to go.
> >>
> >> I 100% agree.
> >>
> >> > Even
> >> > if we couldn't necessarily figure out all the details on the kernel
> >> > internal connections, can we settle on the property though, so we could
> >> > move forward with Rajat's series?
> >>
> >> Yes please, this will also allow us to move forward with userspace
> >> support even if for testing that we do some hacks for the kernel's
> >> internal connections for now.
> >>
> >> > Moreover, do we actually need two properties, one which could indicate
> >> > userspace's desire for the property, and another that tells the hardware
> >> > state?
> >>
> >> No I do not think so. I would expect there to just be one property,
> >> I guess that if the state is (partly) firmware controlled then there
> >> might be a race, but we will need a notification mechanism (*) for
> >> firmware triggered state changes anyways, so shortly after loosing
> >> the race userspace will process the notification and it will know
> >> about it.
> >>
> >> One thing which might be useful is a way to signal that the property
> >> is read-only in case we ever hit hw where that is the case.
> >>
> >> > I'd so very much like to have no in-kernel/in-firmware shortcuts
> >> > to enable/disable the privacy screen, and instead have any hardware
> >> > buttons just be events that the userspace could react to. However I
> >> > don't think that'll be the case unfortunately.
> >>
> >> In my experience with keyboard-backlight support, we will (unfortunately)
> >> see a mix and in some case we will get a notification that the firmware
> >> has adjusted the state, rather then just getting a keypress and
> >> dealing with that ourselves.  In some cases we may even be able to
> >> choose, so the fw will deal with it by default but we can ask it
> >> to just send a key-press.  But I do believe that we can *not* expect
> >> that we will always just get a keypress for userspace to deal with.
> >
> > Hi,
> >
> > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > to use atomic correctly is that userspace will for every update send the
> > full, complete set of all properties that exist, both known and unknown
> > to userspace (to recover from temporarily VT-switching to another KMS
> > program that changes unknown properties). Attempting to track which
> > properties already have their correct values in the kernel is extra
> > work for just extra bugs.
> >
> > Assuming the property is userspace-writable: if kernel goes and
> > changes the property value on its own, it will very likely be just
> > overwritten by userspace right after if userspace does not manage to
> > process the uevent first. If that happens and userspace later
> > processes the uevent, userspace queries the kernel for the current
> > proprerty state which is now what userspace wrote, not what firmware
> > set.
> >
> > Therefore you end up with the firmware hotkey working only randomly.
> >
> > It would be much better to have the hotkey events delivered to
> > userspace so that userspace can control the privacy screen and
> > everything will be reliable, both the hotkeys and any GUI for it.
>
> I'd like this too. However I fear this is out of our control, and OEMs
> have and will anyway fiddle with the privacy screen directly no matter
> what we say, and we can't prevent that. From their POV it's easier for
> them to do their value-add in components they have total control over. I
> emphatize with that view, even if it's counter-productive from the Linux
> ecosystem POV.
>
> So we'll just have to deal with it.
>
> > The other reliable option is that userspace must never be able to
> > change privacy screen state, only the hardware hotkeys can.
>
> That, in turn, discourages anyone from doing the right thing, and blocks
> us from adding any nice additional features for privacy screens that
> only the userspace is capable of managing. For example, controlling
> privacy screen based on content, which seems like an obvious feature.
>
> I suppose rf-kill is a bit similar.
>
> You'd have a userspace controlled property to state what the userspace
> wants, and a kernel controlled property to show the hardware
> state. Userspace can ask for one or the other, and usually this happens,
> but the hardware hotkey bypasses the whole thing.
>
> It's not perfect. But I think just one property changed nilly-willy by
> both the kernel and userspace (especially when it's really not in the
> kernel's full control either) is going to be a PITA.

Yeah that's what we've done in other cases where both kernel and
userspace can change stuff. These where just tri-states, but this here
sounds like we need all for, so best to just have 2 properties.
-Daniel

>
> BR,
> Jani.
>
>
>
> >
> >>
> >> Regards,
> >>
> >> Hans
> >>
> >>
> >> *) Some udev event I guess, I sorta assume there already is a
> >> notification mechanism for property change notifications ?
> >
> > Yes, such mechanism has been discussed before, see the thread containing
> > https://lists.freedesktop.org/archives/dri-devel/2019-May/217588.html
> >
> > TL;DR: the mechanism exists and is called the hotplug event. But the
> > problem with the hotplug event is that it carries no information about
> > what changed, so userspace is forced re-discover everything about the
> > DRM device. That thread discusses extending the hotplug event to be
> > able to signal changes in individual properties rather than "something
> > maybe changed, you figure out what".
> >
> >
> > Thanks,
> > pq
>
> --
> Jani Nikula, Intel Open Source Graphics Center



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-17 14:18         ` Daniel Vetter
@ 2020-04-17 14:54           ` Benjamin Berg
  0 siblings, 0 replies; 54+ messages in thread
From: Benjamin Berg @ 2020-04-17 14:54 UTC (permalink / raw)
  To: Daniel Vetter, Jani Nikula
  Cc: David Airlie, Christian Kellner, Javier Martinez Canillas,
	Hans de Goede, dri-devel, Thomas Zimmermann, Nitin Joshi1,
	Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 1806 bytes --]

On Fri, 2020-04-17 at 16:18 +0200, Daniel Vetter wrote:
> > I suppose rf-kill is a bit similar.

RfKill is actually much more complicated, and I don't really see how it
is related. We may be in the situation where we cannot control the
hardware state, but RfKill has two entirely separate "block" states and
the real value is the logical OR of both.

Also, RfKill key handling is a mess as userspace needs to tell the
kernel it is handling the keys.

> > You'd have a userspace controlled property to state what the userspace
> > wants, and a kernel controlled property to show the hardware
> > state. Userspace can ask for one or the other, and usually this happens,
> > but the hardware hotkey bypasses the whole thing.
> >
> > It's not perfect. But I think just one property changed nilly-willy by
> > both the kernel and userspace (especially when it's really not in the
> > kernel's full control either) is going to be a PITA.
> 
> Yeah that's what we've done in other cases where both kernel and
> userspace can change stuff. These where just tri-states, but this here
> sounds like we need all for, so best to just have 2 properties.

As I see it, the requirements here are:
 * Userspace needs to be able to query the current state
   (possibly unavailable?)
 * Userspace needs to know whether it can set the property
 * Userspace needs to be notified about changes
   (by the firmware or possibly any other reason)

But, should never get into the situation that both the firmware and
software are trying to toggle the state in response to the same event.
In the case of the Fn-key, we'll either deliver a key-press to
userspace or just update the attribute because the firmware has already
handled it. Only one of these two possibilities may happen.

Benjamin

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-17 14:17       ` Daniel Vetter
@ 2020-04-20  8:27         ` Pekka Paalanen
  2020-04-20 10:04           ` Pekka Paalanen
  2020-04-20 10:15           ` Simon Ser
  0 siblings, 2 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-20  8:27 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 5981 bytes --]

On Fri, 17 Apr 2020 16:17:18 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > Hi,
> >
> > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > to use atomic correctly is that userspace will for every update send the
> > full, complete set of all properties that exist, both known and unknown
> > to userspace (to recover from temporarily VT-switching to another KMS
> > program that changes unknown properties). Attempting to track which
> > properties already have their correct values in the kernel is extra
> > work for just extra bugs.  
> 
> Uh if you do that you'll get random surprising failures if you don't
> also set ALLOW_MODESET, because that way you'll automatically repair
> link failures and stuff like that. I'm assuming your userspace only
> supplies all the properties for crtc and planes, and leaves connectors
> as-is? Otherwise you already have some fun bugs.
> 
> In general I'd say userspace shouldn't write stuff it doesn't
> understand. If you limit yourself to just the properties you do want
> to (re)set, that's safe. But if you just blindly write everything all
> the time, random modesets, and hence random failures if you don't set
> ALLOW_MODESET.

Hi,

how should userspace KMS program A recover from the situation when
switching the VT back from KMS program B who changed properties that
program A does not recognise? (I believe Weston does not recover at
the moment.) This is very important for getting e.g. reliable color
reproduction, since not all KMS programs are always up-to-date with
everything the kernel exposes and people may switch between them. Not
resetting everything may even encourage people to write hacks where you
temporarily VT-switch away, run a KMS program to set one property, and
then switch back assuming the property remains set. I have already seen
someone mention they can enable VRR behind the display server's back
like this.

I don't think Weston records and re-sets unknown properties yet, but I
assumed it is what it needs to do to be able to reliably recover from
VT-switches. In that case ALLOW_MODESET is of course set since all
state is unknown and assumed bad.

I do believe Weston re-submits *everything* it knows about every
update, except for CRTCs and connectors it has already disabled and
knows are in disabled state (this could change though).

However, during steady-state operation when ALLOW_MODESET should not be
necessary, is it still harmful to re-program *all* properties on every
update?

After all, the kernel will just no-op all property setting where the
value is already the right one, does it not?

The only "random" KMS state is the properties the userspace KMS
program does not know that are set on start-up. I have been assuming
that as long as you had fbdev active before the KMS program started,
the unknown properties have "harmless" default values. And maybe even at
driver device init if fbdev does not exist?

Is there something more up-to-date than
https://blog.ffwll.ch/2016/01/vt-switching-with-atomic-modeset.html ?

> > Assuming the property is userspace-writable: if kernel goes and
> > changes the property value on its own, it will very likely be just
> > overwritten by userspace right after if userspace does not manage to
> > process the uevent first. If that happens and userspace later
> > processes the uevent, userspace queries the kernel for the current
> > proprerty state which is now what userspace wrote, not what firmware
> > set.
> >
> > Therefore you end up with the firmware hotkey working only randomly.
> >
> > It would be much better to have the hotkey events delivered to
> > userspace so that userspace can control the privacy screen and
> > everything will be reliable, both the hotkeys and any GUI for it. The
> > other reliable option is that userspace must never be able to change
> > privacy screen state, only the hardware hotkeys can.  
> 
> We have fancy new uevents which give you both the connector and the
> property, so you know what's going on.

Yeah, userspace can know what changed, but not the new value with the
uevent.

> Also, a property which userspace and the kernel can race like you
> describe above is broken. We don't have these, and we won't merge
> them.

That's what I would expect, yes, but I'm not that optimistic that the
knowledgeable maintainers can always catch them all, which is why I try
to comment on UAPI additions that look fishy to me.

> The ones we do have the state transitions are a lot clearer, and
> userspace overwriting what the kernel has done is not actually going
> to cause a big pain. At least in the sense of the transition will be
> lost, since for e.g. both link_status and hdcp the value the kernel
> sets is not a value userspace can set. But it can result in problems
> if you just blindly write them again causing modesets you'd not
> expect.

Yeah, HDCP "Content Protection" is very carefully engineered to cope
with the awkwardness that both userspace and kernel will write it. It's
cumbersome.

Btw. I searched for all occurrences of link_status in
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html and it seems it
only has two possible values, good and bad, and no mention whether it
is writable. Looks like it's writable. There does not seem to be a) an
explanation how exactly it needs to the handled (writing it does
something? what can you write?) or b) any way discern between kernel
and userspace set values like HDCP "Content Protection" has.

Weston does not support link_status yet, FWIW.

Unless you have some other idea on how to reset unknown KMS state to
sensible defaults that are always the same, I think any KMS property
where the query can return a value that cannot be written back to it
with ALLOW_MODESET is broken.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20  8:27         ` Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support) Pekka Paalanen
@ 2020-04-20 10:04           ` Pekka Paalanen
  2020-04-20 10:18             ` Simon Ser
  2020-04-21 12:15             ` Daniel Vetter
  2020-04-20 10:15           ` Simon Ser
  1 sibling, 2 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-20 10:04 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 5048 bytes --]

On Mon, 20 Apr 2020 11:27:04 +0300
Pekka Paalanen <ppaalanen@gmail.com> wrote:

> On Fri, 17 Apr 2020 16:17:18 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > >
> > > Hi,
> > >
> > > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > > to use atomic correctly is that userspace will for every update send the
> > > full, complete set of all properties that exist, both known and unknown
> > > to userspace (to recover from temporarily VT-switching to another KMS
> > > program that changes unknown properties). Attempting to track which
> > > properties already have their correct values in the kernel is extra
> > > work for just extra bugs.    
> > 
> > Uh if you do that you'll get random surprising failures if you don't
> > also set ALLOW_MODESET, because that way you'll automatically repair
> > link failures and stuff like that. I'm assuming your userspace only
> > supplies all the properties for crtc and planes, and leaves connectors
> > as-is? Otherwise you already have some fun bugs.
> > 
> > In general I'd say userspace shouldn't write stuff it doesn't
> > understand. If you limit yourself to just the properties you do want
> > to (re)set, that's safe. But if you just blindly write everything all
> > the time, random modesets, and hence random failures if you don't set
> > ALLOW_MODESET.  
> 
> Hi,
> 
> how should userspace KMS program A recover from the situation when
> switching the VT back from KMS program B who changed properties that
> program A does not recognise? (I believe Weston does not recover at
> the moment.) This is very important for getting e.g. reliable color
> reproduction, since not all KMS programs are always up-to-date with
> everything the kernel exposes and people may switch between them. Not
> resetting everything may even encourage people to write hacks where you
> temporarily VT-switch away, run a KMS program to set one property, and
> then switch back assuming the property remains set. I have already seen
> someone mention they can enable VRR behind the display server's back
> like this.
> 
> I don't think Weston records and re-sets unknown properties yet, but I
> assumed it is what it needs to do to be able to reliably recover from
> VT-switches. In that case ALLOW_MODESET is of course set since all
> state is unknown and assumed bad.
> 
> I do believe Weston re-submits *everything* it knows about every
> update, except for CRTCs and connectors it has already disabled and
> knows are in disabled state (this could change though).
> 
> However, during steady-state operation when ALLOW_MODESET should not be
> necessary, is it still harmful to re-program *all* properties on every
> update?
> 
> After all, the kernel will just no-op all property setting where the
> value is already the right one, does it not?
> 
> The only "random" KMS state is the properties the userspace KMS
> program does not know that are set on start-up. I have been assuming
> that as long as you had fbdev active before the KMS program started,
> the unknown properties have "harmless" default values. And maybe even at
> driver device init if fbdev does not exist?
> 
> Is there something more up-to-date than
> https://blog.ffwll.ch/2016/01/vt-switching-with-atomic-modeset.html ?

Thinking more, would the below work?

Actor: a KMS userspace program, e.g. a display server

- On start-up, read all KMS properties and their values. The properties
  that are not recognised are saved in a set called "reset unknowns"
  with their current values.

  Optional: The program commits the "reset unknown" state to KMS with
  ALLOW_MODESET to ensure it all is writable as is; if that fails,
  there is no guarantee that the program could recover later, so it's
  best to abort in that case. This could be part of the initial
  modeset, too.

- When the program has lost and regained DRM master status, meaning
  that (unrecognised) KMS state is potentially incorrect, prepare an
  atomic commit with "reset unknowns" set and add all the recognised
  state the program knows of on top. This resets everything to like it
  was, with ALLOW_MODESET.

- At any other time, do not use the "reset unknowns" set.

The final point is the important one. I have assumed it would be safe
to use always, but apparently not? Good thing I haven't yet written
code to do that.

You have to recognise the property to know if it is safe to set
needlessly (for convenience in both code simplicity and ease of
debugging)?

Also, when using "reset unknowns" set, it actually has to be
partitioned by KMS objects (CRTC, connector, plane...) so if e.g. a
connector no longer exist, you don't attempt to set it.

However, this still leaves writable properties whose value read is not
legal to write as broken. Let's pray that fbcon or a system compositor
will never succeed in enabling HDCP...


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20  8:27         ` Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support) Pekka Paalanen
  2020-04-20 10:04           ` Pekka Paalanen
@ 2020-04-20 10:15           ` Simon Ser
  2020-04-20 12:22             ` Pekka Paalanen
  1 sibling, 1 reply; 54+ messages in thread
From: Simon Ser @ 2020-04-20 10:15 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Thomas Zimmermann,
	Christian Kellner, Javier Martinez Canillas, dri-devel,
	Hans de Goede, Mark Pearson, Rajat Jain, Nitin Joshi1

On Monday, April 20, 2020 10:27 AM, Pekka Paalanen <ppaalanen@gmail.com> wrote:

> The only "random" KMS state is the properties the userspace KMS
> program does not know that are set on start-up. I have been assuming
> that as long as you had fbdev active before the KMS program started,
> the unknown properties have "harmless" default values. And maybe even at
> driver device init if fbdev does not exist?

Note, this is not the case when using e.g. a display manager. In the
past there have been cases of a display manager setting a hw cursor
and launching a compositor not supporting hw cursors. This results in
a stuck hw cursor.

> Btw. I searched for all occurrences of link_status in
> https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html and it seems it
> only has two possible values, good and bad, and no mention whether it
> is writable. Looks like it's writable. There does not seem to be a) an
> explanation how exactly it needs to the handled (writing it does
> something? what can you write?) or b) any way discern between kernel
> and userspace set values like HDCP "Content Protection" has.

User-space needs to reset the value to GOOD when recovering from a BAD
value.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20 10:04           ` Pekka Paalanen
@ 2020-04-20 10:18             ` Simon Ser
  2020-04-21 12:15             ` Daniel Vetter
  1 sibling, 0 replies; 54+ messages in thread
From: Simon Ser @ 2020-04-20 10:18 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Thomas Zimmermann,
	Christian Kellner, Javier Martinez Canillas, dri-devel,
	Hans de Goede, Mark Pearson, Rajat Jain, Nitin Joshi1

I'd really like a drmModeAtomicAddDefaultProperty that resets a prop
to its default value…
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20 10:15           ` Simon Ser
@ 2020-04-20 12:22             ` Pekka Paalanen
  2020-04-20 12:33               ` Simon Ser
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-20 12:22 UTC (permalink / raw)
  To: Simon Ser
  Cc: Benjamin Berg, David Airlie, Thomas Zimmermann,
	Christian Kellner, Javier Martinez Canillas, dri-devel,
	Hans de Goede, Mark Pearson, Rajat Jain, Nitin Joshi1


[-- Attachment #1.1: Type: text/plain, Size: 2233 bytes --]

On Mon, 20 Apr 2020 10:15:39 +0000
Simon Ser <contact@emersion.fr> wrote:

> On Monday, April 20, 2020 10:27 AM, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> 
> > The only "random" KMS state is the properties the userspace KMS
> > program does not know that are set on start-up. I have been assuming
> > that as long as you had fbdev active before the KMS program started,
> > the unknown properties have "harmless" default values. And maybe even at
> > driver device init if fbdev does not exist?  

I meant fbcon, not fbdev above.

> Note, this is not the case when using e.g. a display manager. In the
> past there have been cases of a display manager setting a hw cursor
> and launching a compositor not supporting hw cursors. This results in
> a stuck hw cursor.

Indeed. So the display manager might get sensible defaults, but the
session compositor might not. Or maybe boot splash uses KMS already, so
even display manager doesn't get all-defaults state.

It seems we really do need "sane defaults" from the kernel explicitly.
Writing a userspace tool to save it at boot time before any KMS program
runs would be awkward.

> > Btw. I searched for all occurrences of link_status in
> > https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html and it seems it
> > only has two possible values, good and bad, and no mention whether it
> > is writable. Looks like it's writable. There does not seem to be a) an
> > explanation how exactly it needs to the handled (writing it does
> > something? what can you write?) or b) any way discern between kernel
> > and userspace set values like HDCP "Content Protection" has.  
> 
> User-space needs to reset the value to GOOD when recovering from a BAD
> value.

What if userspace writes BAD?

BAD cannot be default state, so getting default state from somewhere
would solve this property's restoring as well. Reading back the true
current value could accidentally return BAD.


Just to reiterate for everyone, the important thing here is to figure
out how userspace is supposed to reset unknown properties to sensible
defaults. Once we know how that should work, we can review whether new
properties support or break that.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20 12:22             ` Pekka Paalanen
@ 2020-04-20 12:33               ` Simon Ser
  0 siblings, 0 replies; 54+ messages in thread
From: Simon Ser @ 2020-04-20 12:33 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Thomas Zimmermann,
	Christian Kellner, Javier Martinez Canillas, dri-devel,
	Hans de Goede, Mark Pearson, Rajat Jain, Nitin Joshi1

On Monday, April 20, 2020 2:22 PM, Pekka Paalanen <ppaalanen@gmail.com> wrote:

> On Mon, 20 Apr 2020 10:15:39 +0000
> Simon Ser contact@emersion.fr wrote:
>
> > On Monday, April 20, 2020 10:27 AM, Pekka Paalanen ppaalanen@gmail.com wrote:
> >
> > > The only "random" KMS state is the properties the userspace KMS
> > > program does not know that are set on start-up. I have been assuming
> > > that as long as you had fbdev active before the KMS program started,
> > > the unknown properties have "harmless" default values. And maybe even at
> > > driver device init if fbdev does not exist?
>
> I meant fbcon, not fbdev above.
>
> > Note, this is not the case when using e.g. a display manager. In the
> > past there have been cases of a display manager setting a hw cursor
> > and launching a compositor not supporting hw cursors. This results in
> > a stuck hw cursor.
>
> Indeed. So the display manager might get sensible defaults, but the
> session compositor might not. Or maybe boot splash uses KMS already, so
> even display manager doesn't get all-defaults state.
>
> It seems we really do need "sane defaults" from the kernel explicitly.
> Writing a userspace tool to save it at boot time before any KMS program
> runs would be awkward.

Agreed.

> > > Btw. I searched for all occurrences of link_status in
> > > https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html and it seems it
> > > only has two possible values, good and bad, and no mention whether it
> > > is writable. Looks like it's writable. There does not seem to be a) an
> > > explanation how exactly it needs to the handled (writing it does
> > > something? what can you write?) or b) any way discern between kernel
> > > and userspace set values like HDCP "Content Protection" has.
> >
> > User-space needs to reset the value to GOOD when recovering from a BAD
> > value.
>
> What if userspace writes BAD?
>
> BAD cannot be default state, so getting default state from somewhere
> would solve this property's restoring as well. Reading back the true
> current value could accidentally return BAD.


Interestingly, setting it to BAD is a no-op (BAD is silently
discarded):

	/* Never downgrade from GOOD to BAD on userspace's request here,
	 * only hw issues can do that.
	 *
	 * For an atomic property the userspace doesn't need to be able
	 * to understand all the properties, but needs to be able to
	 * restore the state it wants on VT switch. So if the userspace
	 * tries to change the link_status from GOOD to BAD, driver
	 * silently rejects it and returns a 0. This prevents userspace
	 * from accidently breaking  the display when it restores the
	 * state.
	 */
	if (state->link_status != DRM_LINK_STATUS_GOOD)
		state->link_status = val;

So restoring the "sane default" would be work, even if the link happens
to be BAD when saving said "sane defaults".

> Just to reiterate for everyone, the important thing here is to figure
> out how userspace is supposed to reset unknown properties to sensible
> defaults. Once we know how that should work, we can review whether new
> properties support or break that.

Yes, that's a good description of the problem.

I see two main solutions here: either the kernel provides the default
values in its property descriptions (e.g. drmModeGetProperty), either
user-space can ask the kernel to reset properties to their default
values.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-20 10:04           ` Pekka Paalanen
  2020-04-20 10:18             ` Simon Ser
@ 2020-04-21 12:15             ` Daniel Vetter
  2020-04-21 14:33               ` Pekka Paalanen
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-21 12:15 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

On Mon, Apr 20, 2020 at 01:04:20PM +0300, Pekka Paalanen wrote:
> On Mon, 20 Apr 2020 11:27:04 +0300
> Pekka Paalanen <ppaalanen@gmail.com> wrote:
> 
> > On Fri, 17 Apr 2020 16:17:18 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:
> > 
> > > On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > > >
> > > > Hi,
> > > >
> > > > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > > > to use atomic correctly is that userspace will for every update send the
> > > > full, complete set of all properties that exist, both known and unknown
> > > > to userspace (to recover from temporarily VT-switching to another KMS
> > > > program that changes unknown properties). Attempting to track which
> > > > properties already have their correct values in the kernel is extra
> > > > work for just extra bugs.    
> > > 
> > > Uh if you do that you'll get random surprising failures if you don't
> > > also set ALLOW_MODESET, because that way you'll automatically repair
> > > link failures and stuff like that. I'm assuming your userspace only
> > > supplies all the properties for crtc and planes, and leaves connectors
> > > as-is? Otherwise you already have some fun bugs.
> > > 
> > > In general I'd say userspace shouldn't write stuff it doesn't
> > > understand. If you limit yourself to just the properties you do want
> > > to (re)set, that's safe. But if you just blindly write everything all
> > > the time, random modesets, and hence random failures if you don't set
> > > ALLOW_MODESET.  
> > 
> > Hi,
> > 
> > how should userspace KMS program A recover from the situation when
> > switching the VT back from KMS program B who changed properties that
> > program A does not recognise? (I believe Weston does not recover at
> > the moment.) This is very important for getting e.g. reliable color
> > reproduction, since not all KMS programs are always up-to-date with
> > everything the kernel exposes and people may switch between them. Not
> > resetting everything may even encourage people to write hacks where you
> > temporarily VT-switch away, run a KMS program to set one property, and
> > then switch back assuming the property remains set. I have already seen
> > someone mention they can enable VRR behind the display server's back
> > like this.
> > 
> > I don't think Weston records and re-sets unknown properties yet, but I
> > assumed it is what it needs to do to be able to reliably recover from
> > VT-switches. In that case ALLOW_MODESET is of course set since all
> > state is unknown and assumed bad.
> > 
> > I do believe Weston re-submits *everything* it knows about every
> > update, except for CRTCs and connectors it has already disabled and
> > knows are in disabled state (this could change though).
> > 
> > However, during steady-state operation when ALLOW_MODESET should not be
> > necessary, is it still harmful to re-program *all* properties on every
> > update?
> > 
> > After all, the kernel will just no-op all property setting where the
> > value is already the right one, does it not?
> > 
> > The only "random" KMS state is the properties the userspace KMS
> > program does not know that are set on start-up. I have been assuming
> > that as long as you had fbdev active before the KMS program started,
> > the unknown properties have "harmless" default values. And maybe even at
> > driver device init if fbdev does not exist?
> > 
> > Is there something more up-to-date than
> > https://blog.ffwll.ch/2016/01/vt-switching-with-atomic-modeset.html ?

Sadly, nothing changed since then.

> Thinking more, would the below work?

Yup, this would somewhat work. Except not always, I've found one case
where even this goes wrong:

- Content-protection property, if enabled, has the kernel automatically
  switch to enabled if hdcp is actually on and authenticated and all that.
  Writing back that enabled value will fail. But there's good chances that
  at boot-up content protection isn't enabled, so should work out nicely.

- We could fix this by silently downcasting enabled to requested, but
  might still lead to surprises since it makes hdcp rather more sticky
  than some users might like.

> Actor: a KMS userspace program, e.g. a display server
> 
> - On start-up, read all KMS properties and their values. The properties
>   that are not recognised are saved in a set called "reset unknowns"
>   with their current values.
> 
>   Optional: The program commits the "reset unknown" state to KMS with
>   ALLOW_MODESET to ensure it all is writable as is; if that fails,
>   there is no guarantee that the program could recover later, so it's
>   best to abort in that case. This could be part of the initial
>   modeset, too.
> 
> - When the program has lost and regained DRM master status, meaning
>   that (unrecognised) KMS state is potentially incorrect, prepare an
>   atomic commit with "reset unknowns" set and add all the recognised
>   state the program knows of on top. This resets everything to like it
>   was, with ALLOW_MODESET.
> 
> - At any other time, do not use the "reset unknowns" set.
> 
> The final point is the important one. I have assumed it would be safe
> to use always, but apparently not? Good thing I haven't yet written
> code to do that.
> 
> You have to recognise the property to know if it is safe to set
> needlessly (for convenience in both code simplicity and ease of
> debugging)?
> 
> Also, when using "reset unknowns" set, it actually has to be
> partitioned by KMS objects (CRTC, connector, plane...) so if e.g. a
> connector no longer exist, you don't attempt to set it.
> 
> However, this still leaves writable properties whose value read is not
> legal to write as broken. Let's pray that fbcon or a system compositor
> will never succeed in enabling HDCP...

Note that the kernel isn't entire consistent on this. I've looked a bit
more closely at stuff. Ignoring content protection I've found following
approaches things:

- self refresh helpers, which are entirely transparent. Therefore we do a
  hack to set allow_modeset when the self-refresh helpers need to do a
  modeset, to avoid total surprise for userspace. I think this is only ok
  for these kind of behind-the-scenes helpers like self-refresh.

- link-status is always reset to "good" when you include any connector,
  which might force a modeset. Even when allow_modeset isn't set by
  userspace. Maybe we should fix that, but we've discussed forever how to
  make sure a bad link isn't ever stuck at "bad" for old userspace, so
  we've gone with this. But maybe limiting to only allow_modeset cases
  would also work.

- I guess we could do stuff that only fires off when allow_modeset is set,
  but I haven't found some examples. I thought we've had some outside of
  self-refresh helpers already. The closest I've found is again
  link-status, which never allows userspace to set BAD, and silently
  upgrades to GOOD. So that userspace doing a blind safe/restore can't
  wreak stuff permanently.

It's all a bit nasty :-/

I think we should at least allow userspace to do a blind restore with
allow_modeset and not expect bad side-effects. That would mean fixing at
least the content protection stuff.

Plus documenting this in the kernel somewhere. As the official thing to
do. But maybe we want some actual userspace doing this before we enshrine
it as official policy. The content protection fix is a one-liner and can
be cc'ed stable.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-17 11:55       ` Jani Nikula
  2020-04-17 14:18         ` Daniel Vetter
@ 2020-04-21 12:37         ` Hans de Goede
  2020-04-21 12:40           ` Daniel Vetter
  2020-04-21 14:46           ` Pekka Paalanen
  1 sibling, 2 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-21 12:37 UTC (permalink / raw)
  To: Jani Nikula, Pekka Paalanen, Mark Pearson
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner,
	Javier Martinez Canillas, dri-devel, Rajat Jain, Nitin Joshi1,
	Benjamin Berg

Hi,

On 4/17/20 1:55 PM, Jani Nikula wrote:
> On Fri, 17 Apr 2020, Pekka Paalanen <ppaalanen@gmail.com> wrote:
>> On Wed, 15 Apr 2020 17:40:46 +0200
>> Hans de Goede <hdegoede@redhat.com> wrote:
>>
>>> Hi,
>>>
>>> On 4/15/20 5:28 PM, Jani Nikula wrote:
>>>> On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>> ii. Currently the "privacy-screen" property added by Rajat's
>>>>> patch-set is an enum with 2 possible values:
>>>>> "Enabled"
>>>>> "Disabled"
>>>>>
>>>>> We could add a third value "Not Available", which would be the
>>>>> default and then for internal panels always add the property
>>>>> so that we avoid the problem that detecting if the laptop has
>>>>> an internal privacy screen needs to be done before the connector
>>>>> is registered. Then we can add some hooks which allow an
>>>>> lcdshadow-driver to register itself against a connector later
>>>>> (which is non trivial wrt probe order, but lets ignore that for now).
>>>>
>>>> I regret dropping the ball on Rajat's series (sorry!).
>>>>
>>>> I do think having the connector property for this is the way to go.
>>>
>>> I 100% agree.
>>>
>>>> Even
>>>> if we couldn't necessarily figure out all the details on the kernel
>>>> internal connections, can we settle on the property though, so we could
>>>> move forward with Rajat's series?
>>>
>>> Yes please, this will also allow us to move forward with userspace
>>> support even if for testing that we do some hacks for the kernel's
>>> internal connections for now.
>>>
>>>> Moreover, do we actually need two properties, one which could indicate
>>>> userspace's desire for the property, and another that tells the hardware
>>>> state?
>>>
>>> No I do not think so. I would expect there to just be one property,
>>> I guess that if the state is (partly) firmware controlled then there
>>> might be a race, but we will need a notification mechanism (*) for
>>> firmware triggered state changes anyways, so shortly after loosing
>>> the race userspace will process the notification and it will know
>>> about it.
>>>
>>> One thing which might be useful is a way to signal that the property
>>> is read-only in case we ever hit hw where that is the case.
>>>
>>>> I'd so very much like to have no in-kernel/in-firmware shortcuts
>>>> to enable/disable the privacy screen, and instead have any hardware
>>>> buttons just be events that the userspace could react to. However I
>>>> don't think that'll be the case unfortunately.
>>>
>>> In my experience with keyboard-backlight support, we will (unfortunately)
>>> see a mix and in some case we will get a notification that the firmware
>>> has adjusted the state, rather then just getting a keypress and
>>> dealing with that ourselves.  In some cases we may even be able to
>>> choose, so the fw will deal with it by default but we can ask it
>>> to just send a key-press.  But I do believe that we can *not* expect
>>> that we will always just get a keypress for userspace to deal with.
>>
>> Hi,
>>
>> let's think about how userspace uses atomic KMS UAPI. The simplest way
>> to use atomic correctly is that userspace will for every update send the
>> full, complete set of all properties that exist, both known and unknown
>> to userspace (to recover from temporarily VT-switching to another KMS
>> program that changes unknown properties). Attempting to track which
>> properties already have their correct values in the kernel is extra
>> work for just extra bugs.
>>
>> Assuming the property is userspace-writable: if kernel goes and
>> changes the property value on its own, it will very likely be just
>> overwritten by userspace right after if userspace does not manage to
>> process the uevent first. If that happens and userspace later
>> processes the uevent, userspace queries the kernel for the current
>> proprerty state which is now what userspace wrote, not what firmware
>> set.
>>
>> Therefore you end up with the firmware hotkey working only randomly.
>>
>> It would be much better to have the hotkey events delivered to
>> userspace so that userspace can control the privacy screen and
>> everything will be reliable, both the hotkeys and any GUI for it.
> 
> I'd like this too. However I fear this is out of our control, and OEMs
> have and will anyway fiddle with the privacy screen directly no matter
> what we say, and we can't prevent that. From their POV it's easier for
> them to do their value-add in components they have total control over. I
> emphatize with that view, even if it's counter-productive from the Linux
> ecosystem POV.
> 
> So we'll just have to deal with it.

Ack, at least that is the case for the current generation Lenovo devices.

>> The other reliable option is that userspace must never be able to
>> change privacy screen state, only the hardware hotkeys can.
> 
> That, in turn, discourages anyone from doing the right thing, and blocks
> us from adding any nice additional features for privacy screens that
> only the userspace is capable of managing. For example, controlling
> privacy screen based on content, which seems like an obvious feature.

Right.

So we have the case here were both the firmware and userspace may change
the privacyscreen state (on/off) at any time.

This means that the atomic API use described by Pekka for this, where
userspace keeps all properties in memory, updates the one which it
wants to change and then commits simply cannot work here.

Userspace should only include this property in the transaction if
it actually wants to change it. Or it should do a read of it
and update its internal state before committing that state unless
it wants to change it. But always blindly committing the last value
used by userspace will simply not work, because then there is no
way for the kernel to know if userspace is just repeating itself
or actually wants to change the property.

As for the one vs two properties. For the current hw userspace can
always change the state at any time. But I can envision devices
where there is a hardware override forcing the privacy screen to
be on. So I guess that a r/w "software state" + a ro "hardware state"
property would make sense. Note that with the current gen. Lenovo
devices the hw-state will simply be a mirror of the sw-state and
the sw-state will be toggled by the firmware independently of
the last value requested by userspace.

I guess we could add a third ro firmware-state property, which
would reflect the last firmware requested value, but I cannot
really come up with clear semantics for this; nor can I come
up with how this actually helps.

TL;DR: Yes there will be races, because of both userspace +
the firmware having; and potentially using r/w access to
the privacy-screen state. But in practice I expect these
to not really be an issue. Important here is that userspace
only commits the property in a transaction to commit if
it actually intends to change the property so as to not
needlessly create a situation where we might hit the race.

As for 1 vs 2 properties for this I guess that in preparation
for potential devices where the state is locked, having a
r/w sw-state + a ro hw-state property makes sense.

So I suggest that we replace the current "privacy-screen" property
from Rajat's patch-set with 2 props named:

"privacy-screen-sw-state" (r/w)
"privacy-screen-hw-state" (ro)

Where for current gen hardware the privacy-screen-hw-state is
just a mirror of the sw-state.

Regards,

Hans







_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-21 12:37         ` Hans de Goede
@ 2020-04-21 12:40           ` Daniel Vetter
  2020-04-21 14:46           ` Pekka Paalanen
  1 sibling, 0 replies; 54+ messages in thread
From: Daniel Vetter @ 2020-04-21 12:40 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

On Tue, Apr 21, 2020 at 02:37:41PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 4/17/20 1:55 PM, Jani Nikula wrote:
> > On Fri, 17 Apr 2020, Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > > On Wed, 15 Apr 2020 17:40:46 +0200
> > > Hans de Goede <hdegoede@redhat.com> wrote:
> > > 
> > > > Hi,
> > > > 
> > > > On 4/15/20 5:28 PM, Jani Nikula wrote:
> > > > > On Wed, 15 Apr 2020, Hans de Goede <hdegoede@redhat.com> wrote:
> > > > > > ii. Currently the "privacy-screen" property added by Rajat's
> > > > > > patch-set is an enum with 2 possible values:
> > > > > > "Enabled"
> > > > > > "Disabled"
> > > > > > 
> > > > > > We could add a third value "Not Available", which would be the
> > > > > > default and then for internal panels always add the property
> > > > > > so that we avoid the problem that detecting if the laptop has
> > > > > > an internal privacy screen needs to be done before the connector
> > > > > > is registered. Then we can add some hooks which allow an
> > > > > > lcdshadow-driver to register itself against a connector later
> > > > > > (which is non trivial wrt probe order, but lets ignore that for now).
> > > > > 
> > > > > I regret dropping the ball on Rajat's series (sorry!).
> > > > > 
> > > > > I do think having the connector property for this is the way to go.
> > > > 
> > > > I 100% agree.
> > > > 
> > > > > Even
> > > > > if we couldn't necessarily figure out all the details on the kernel
> > > > > internal connections, can we settle on the property though, so we could
> > > > > move forward with Rajat's series?
> > > > 
> > > > Yes please, this will also allow us to move forward with userspace
> > > > support even if for testing that we do some hacks for the kernel's
> > > > internal connections for now.
> > > > 
> > > > > Moreover, do we actually need two properties, one which could indicate
> > > > > userspace's desire for the property, and another that tells the hardware
> > > > > state?
> > > > 
> > > > No I do not think so. I would expect there to just be one property,
> > > > I guess that if the state is (partly) firmware controlled then there
> > > > might be a race, but we will need a notification mechanism (*) for
> > > > firmware triggered state changes anyways, so shortly after loosing
> > > > the race userspace will process the notification and it will know
> > > > about it.
> > > > 
> > > > One thing which might be useful is a way to signal that the property
> > > > is read-only in case we ever hit hw where that is the case.
> > > > 
> > > > > I'd so very much like to have no in-kernel/in-firmware shortcuts
> > > > > to enable/disable the privacy screen, and instead have any hardware
> > > > > buttons just be events that the userspace could react to. However I
> > > > > don't think that'll be the case unfortunately.
> > > > 
> > > > In my experience with keyboard-backlight support, we will (unfortunately)
> > > > see a mix and in some case we will get a notification that the firmware
> > > > has adjusted the state, rather then just getting a keypress and
> > > > dealing with that ourselves.  In some cases we may even be able to
> > > > choose, so the fw will deal with it by default but we can ask it
> > > > to just send a key-press.  But I do believe that we can *not* expect
> > > > that we will always just get a keypress for userspace to deal with.
> > > 
> > > Hi,
> > > 
> > > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > > to use atomic correctly is that userspace will for every update send the
> > > full, complete set of all properties that exist, both known and unknown
> > > to userspace (to recover from temporarily VT-switching to another KMS
> > > program that changes unknown properties). Attempting to track which
> > > properties already have their correct values in the kernel is extra
> > > work for just extra bugs.
> > > 
> > > Assuming the property is userspace-writable: if kernel goes and
> > > changes the property value on its own, it will very likely be just
> > > overwritten by userspace right after if userspace does not manage to
> > > process the uevent first. If that happens and userspace later
> > > processes the uevent, userspace queries the kernel for the current
> > > proprerty state which is now what userspace wrote, not what firmware
> > > set.
> > > 
> > > Therefore you end up with the firmware hotkey working only randomly.
> > > 
> > > It would be much better to have the hotkey events delivered to
> > > userspace so that userspace can control the privacy screen and
> > > everything will be reliable, both the hotkeys and any GUI for it.
> > 
> > I'd like this too. However I fear this is out of our control, and OEMs
> > have and will anyway fiddle with the privacy screen directly no matter
> > what we say, and we can't prevent that. From their POV it's easier for
> > them to do their value-add in components they have total control over. I
> > emphatize with that view, even if it's counter-productive from the Linux
> > ecosystem POV.
> > 
> > So we'll just have to deal with it.
> 
> Ack, at least that is the case for the current generation Lenovo devices.
> 
> > > The other reliable option is that userspace must never be able to
> > > change privacy screen state, only the hardware hotkeys can.
> > 
> > That, in turn, discourages anyone from doing the right thing, and blocks
> > us from adding any nice additional features for privacy screens that
> > only the userspace is capable of managing. For example, controlling
> > privacy screen based on content, which seems like an obvious feature.
> 
> Right.
> 
> So we have the case here were both the firmware and userspace may change
> the privacyscreen state (on/off) at any time.
> 
> This means that the atomic API use described by Pekka for this, where
> userspace keeps all properties in memory, updates the one which it
> wants to change and then commits simply cannot work here.
> 
> Userspace should only include this property in the transaction if
> it actually wants to change it. Or it should do a read of it
> and update its internal state before committing that state unless
> it wants to change it. But always blindly committing the last value
> used by userspace will simply not work, because then there is no
> way for the kernel to know if userspace is just repeating itself
> or actually wants to change the property.
> 
> As for the one vs two properties. For the current hw userspace can
> always change the state at any time. But I can envision devices
> where there is a hardware override forcing the privacy screen to
> be on. So I guess that a r/w "software state" + a ro "hardware state"
> property would make sense. Note that with the current gen. Lenovo
> devices the hw-state will simply be a mirror of the sw-state and
> the sw-state will be toggled by the firmware independently of
> the last value requested by userspace.
> 
> I guess we could add a third ro firmware-state property, which
> would reflect the last firmware requested value, but I cannot
> really come up with clear semantics for this; nor can I come
> up with how this actually helps.
> 
> TL;DR: Yes there will be races, because of both userspace +
> the firmware having; and potentially using r/w access to
> the privacy-screen state. But in practice I expect these
> to not really be an issue. Important here is that userspace
> only commits the property in a transaction to commit if
> it actually intends to change the property so as to not
> needlessly create a situation where we might hit the race.
> 
> As for 1 vs 2 properties for this I guess that in preparation
> for potential devices where the state is locked, having a
> r/w sw-state + a ro hw-state property makes sense.
> 
> So I suggest that we replace the current "privacy-screen" property
> from Rajat's patch-set with 2 props named:
> 
> "privacy-screen-sw-state" (r/w)
> "privacy-screen-hw-state" (ro)
> 
> Where for current gen hardware the privacy-screen-hw-state is
> just a mirror of the sw-state.

Yup. I think internally we should have some kind of explicit update
function perhaps, that's at least how immutable (by userspace) properties
work. Or we'll change that to a callback, so that there's no inversion
going on.
-Daniel

> 
> Regards,
> 
> Hans
> 
> 
> 
> 
> 
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-21 12:15             ` Daniel Vetter
@ 2020-04-21 14:33               ` Pekka Paalanen
  2020-04-21 14:39                 ` Simon Ser
  2020-04-23 15:01                 ` Daniel Vetter
  0 siblings, 2 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-21 14:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 9400 bytes --]

On Tue, 21 Apr 2020 14:15:52 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Mon, Apr 20, 2020 at 01:04:20PM +0300, Pekka Paalanen wrote:
> > On Mon, 20 Apr 2020 11:27:04 +0300
> > Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >   
> > > On Fri, 17 Apr 2020 16:17:18 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > >   
> > > > On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:    
> > > > >
> > > > > Hi,
> > > > >
> > > > > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > > > > to use atomic correctly is that userspace will for every update send the
> > > > > full, complete set of all properties that exist, both known and unknown
> > > > > to userspace (to recover from temporarily VT-switching to another KMS
> > > > > program that changes unknown properties). Attempting to track which
> > > > > properties already have their correct values in the kernel is extra
> > > > > work for just extra bugs.      
> > > > 
> > > > Uh if you do that you'll get random surprising failures if you don't
> > > > also set ALLOW_MODESET, because that way you'll automatically repair
> > > > link failures and stuff like that. I'm assuming your userspace only
> > > > supplies all the properties for crtc and planes, and leaves connectors
> > > > as-is? Otherwise you already have some fun bugs.
> > > > 
> > > > In general I'd say userspace shouldn't write stuff it doesn't
> > > > understand. If you limit yourself to just the properties you do want
> > > > to (re)set, that's safe. But if you just blindly write everything all
> > > > the time, random modesets, and hence random failures if you don't set
> > > > ALLOW_MODESET.    
> > > 
> > > Hi,
> > > 
> > > how should userspace KMS program A recover from the situation when
> > > switching the VT back from KMS program B who changed properties that
> > > program A does not recognise? (I believe Weston does not recover at
> > > the moment.) This is very important for getting e.g. reliable color
> > > reproduction, since not all KMS programs are always up-to-date with
> > > everything the kernel exposes and people may switch between them. Not
> > > resetting everything may even encourage people to write hacks where you
> > > temporarily VT-switch away, run a KMS program to set one property, and
> > > then switch back assuming the property remains set. I have already seen
> > > someone mention they can enable VRR behind the display server's back
> > > like this.
> > > 
> > > I don't think Weston records and re-sets unknown properties yet, but I
> > > assumed it is what it needs to do to be able to reliably recover from
> > > VT-switches. In that case ALLOW_MODESET is of course set since all
> > > state is unknown and assumed bad.
> > > 
> > > I do believe Weston re-submits *everything* it knows about every
> > > update, except for CRTCs and connectors it has already disabled and
> > > knows are in disabled state (this could change though).
> > > 
> > > However, during steady-state operation when ALLOW_MODESET should not be
> > > necessary, is it still harmful to re-program *all* properties on every
> > > update?
> > > 
> > > After all, the kernel will just no-op all property setting where the
> > > value is already the right one, does it not?
> > > 
> > > The only "random" KMS state is the properties the userspace KMS
> > > program does not know that are set on start-up. I have been assuming
> > > that as long as you had fbdev active before the KMS program started,
> > > the unknown properties have "harmless" default values. And maybe even at
> > > driver device init if fbdev does not exist?
> > > 
> > > Is there something more up-to-date than
> > > https://blog.ffwll.ch/2016/01/vt-switching-with-atomic-modeset.html ?  
> 
> Sadly, nothing changed since then.
> 
> > Thinking more, would the below work?  
> 
> Yup, this would somewhat work. Except not always, I've found one case
> where even this goes wrong:
> 
> - Content-protection property, if enabled, has the kernel automatically
>   switch to enabled if hdcp is actually on and authenticated and all that.
>   Writing back that enabled value will fail. But there's good chances that
>   at boot-up content protection isn't enabled, so should work out nicely.
> 
> - We could fix this by silently downcasting enabled to requested, but
>   might still lead to surprises since it makes hdcp rather more sticky
>   than some users might like.

The fix doesn't make HDCP any more or less sticky, it just makes it
possible to not fail a resetting atomic commit. Without a resetting
commit, it will remain DESIRED/ENABLED.

If "Content Protection" reads back as DESIRED, userspace that relies on
read-back for reset will reset it to DESIRED. Could as well be ENABLED
that the kernel just takes as DESIRED when written.

> > Actor: a KMS userspace program, e.g. a display server
> > 
> > - On start-up, read all KMS properties and their values. The properties
> >   that are not recognised are saved in a set called "reset unknowns"
> >   with their current values.
> > 
> >   Optional: The program commits the "reset unknown" state to KMS with
> >   ALLOW_MODESET to ensure it all is writable as is; if that fails,
> >   there is no guarantee that the program could recover later, so it's
> >   best to abort in that case. This could be part of the initial
> >   modeset, too.
> > 
> > - When the program has lost and regained DRM master status, meaning
> >   that (unrecognised) KMS state is potentially incorrect, prepare an
> >   atomic commit with "reset unknowns" set and add all the recognised
> >   state the program knows of on top. This resets everything to like it
> >   was, with ALLOW_MODESET.
> > 
> > - At any other time, do not use the "reset unknowns" set.
> > 
> > The final point is the important one. I have assumed it would be safe
> > to use always, but apparently not? Good thing I haven't yet written
> > code to do that.
> > 
> > You have to recognise the property to know if it is safe to set
> > needlessly (for convenience in both code simplicity and ease of
> > debugging)?
> > 
> > Also, when using "reset unknowns" set, it actually has to be
> > partitioned by KMS objects (CRTC, connector, plane...) so if e.g. a
> > connector no longer exist, you don't attempt to set it.
> > 
> > However, this still leaves writable properties whose value read is not
> > legal to write as broken. Let's pray that fbcon or a system compositor
> > will never succeed in enabling HDCP...  
> 
> Note that the kernel isn't entire consistent on this. I've looked a bit
> more closely at stuff. Ignoring content protection I've found following
> approaches things:
> 
> - self refresh helpers, which are entirely transparent. Therefore we do a
>   hack to set allow_modeset when the self-refresh helpers need to do a
>   modeset, to avoid total surprise for userspace. I think this is only ok
>   for these kind of behind-the-scenes helpers like self-refresh.
> 
> - link-status is always reset to "good" when you include any connector,
>   which might force a modeset. Even when allow_modeset isn't set by
>   userspace. Maybe we should fix that, but we've discussed forever how to
>   make sure a bad link isn't ever stuck at "bad" for old userspace, so
>   we've gone with this. But maybe limiting to only allow_modeset cases
>   would also work.

Wait, what do you mean "include any connector"?

What exactly could cause a modeset instead of failure when
ALLOW_MODESET is not set?

Does that mean that I'll never need to implement link-status handling
in Weston, because the kernel will recover the link anyway? If the
kernel does that, then what's the point of having a link-status
property to begin with?

> - I guess we could do stuff that only fires off when allow_modeset is set,
>   but I haven't found some examples. I thought we've had some outside of
>   self-refresh helpers already. The closest I've found is again
>   link-status, which never allows userspace to set BAD, and silently
>   upgrades to GOOD. So that userspace doing a blind safe/restore can't
>   wreak stuff permanently.

Sounds like link-status was designed with a blind save/restore in mind.

> It's all a bit nasty :-/
> 
> I think we should at least allow userspace to do a blind restore with
> allow_modeset and not expect bad side-effects. That would mean fixing at
> least the content protection stuff.
> 
> Plus documenting this in the kernel somewhere. As the official thing to
> do. But maybe we want some actual userspace doing this before we enshrine
> it as official policy. The content protection fix is a one-liner and can
> be cc'ed stable.

I'd probably not go there, a blind save does not guarantee a good
state. The fix to "Content Protection" is not necessary (as long as
userspace does not do a blind save/restore) if we can get the default
state from the kernel. If we get the default state from the kernel,
then userspace would be doing a blind restore but not save, meaning
that the state actually is sane and writable.

I'd love to volunteer for writing the Weston code to make use of "get me
sane default state" UAPI, but I'm afraid I'm not in that much control
of my time.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-21 14:33               ` Pekka Paalanen
@ 2020-04-21 14:39                 ` Simon Ser
  2020-04-23 15:01                 ` Daniel Vetter
  1 sibling, 0 replies; 54+ messages in thread
From: Simon Ser @ 2020-04-21 14:39 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Thomas Zimmermann,
	Christian Kellner, Javier Martinez Canillas, dri-devel,
	Hans de Goede, Mark Pearson, Rajat Jain, Nitin Joshi1

On Tuesday, April 21, 2020 4:33 PM, Pekka Paalanen <ppaalanen@gmail.com> wrote:

> I'd love to volunteer for writing the Weston code to make use of "get me
> sane default state" UAPI, but I'm afraid I'm not in that much control
> of my time.

I'm interested in this problem too. If someone writes the kernel side,
I'll gladly write user-space for it.
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-21 12:37         ` Hans de Goede
  2020-04-21 12:40           ` Daniel Vetter
@ 2020-04-21 14:46           ` Pekka Paalanen
  2020-04-23 18:21             ` Rajat Jain
  1 sibling, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-21 14:46 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg


[-- Attachment #1.1: Type: text/plain, Size: 2535 bytes --]

On Tue, 21 Apr 2020 14:37:41 +0200
Hans de Goede <hdegoede@redhat.com> wrote:

> TL;DR: Yes there will be races, because of both userspace +
> the firmware having; and potentially using r/w access to
> the privacy-screen state. But in practice I expect these
> to not really be an issue. Important here is that userspace
> only commits the property in a transaction to commit if
> it actually intends to change the property so as to not
> needlessly create a situation where we might hit the race.
> 
> As for 1 vs 2 properties for this I guess that in preparation
> for potential devices where the state is locked, having a
> r/w sw-state + a ro hw-state property makes sense.
> 
> So I suggest that we replace the current "privacy-screen" property
> from Rajat's patch-set with 2 props named:
> 
> "privacy-screen-sw-state" (r/w)
> "privacy-screen-hw-state" (ro)
> 
> Where for current gen hardware the privacy-screen-hw-state is
> just a mirror of the sw-state.

Hi,

this sounds like a good plan to me, assuming the kernel writes only to
the ro property and never to the r/w property.

I understand that as long as firmware hotkeys will toggle actual state,
there is no design that could work reliably if userspace will always
commit all KMS state even when it is not necessary. But not committing
KMS state unless it is actually necessary is really a new requirement on
userspace, so that needs to be documented before it's too late.

It's not enough to document "don't set it unless you want to
overwrite/change it" for privacy screen properties. It needs to be
documented as a general rule that userspace must follow with *unknown*
properties as well. "Do not restore unrecognized properties unless the
kernel KMS state might be incorrect compared to what you used to have."

This means that with a display server that does not understand privacy
screen properties, the end user will lose the privacy screen state on
every VT-switch back to the display server.

However, if we had a way to query the kernel for the default state to
reset unknown properties to, the kernel implementation could return the
current value of the privacy screen property instead of "off" to not
lose the firmware state. Assuming firmware hotkeys exist, but if they
don't then return just "off". The point is that the kernel who knows
all the properties makes the decision what a sane reset value is.
Userspace can always override the reset value for the properties
it recognizes.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-21 14:33               ` Pekka Paalanen
  2020-04-21 14:39                 ` Simon Ser
@ 2020-04-23 15:01                 ` Daniel Vetter
  2020-04-24  8:32                   ` Pekka Paalanen
  1 sibling, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-23 15:01 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 21 Apr 2020 14:15:52 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
>
> > On Mon, Apr 20, 2020 at 01:04:20PM +0300, Pekka Paalanen wrote:
> > > On Mon, 20 Apr 2020 11:27:04 +0300
> > > Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > >
> > > > On Fri, 17 Apr 2020 16:17:18 +0200
> > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >
> > > > > On Fri, Apr 17, 2020 at 11:02 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > let's think about how userspace uses atomic KMS UAPI. The simplest way
> > > > > > to use atomic correctly is that userspace will for every update send the
> > > > > > full, complete set of all properties that exist, both known and unknown
> > > > > > to userspace (to recover from temporarily VT-switching to another KMS
> > > > > > program that changes unknown properties). Attempting to track which
> > > > > > properties already have their correct values in the kernel is extra
> > > > > > work for just extra bugs.
> > > > >
> > > > > Uh if you do that you'll get random surprising failures if you don't
> > > > > also set ALLOW_MODESET, because that way you'll automatically repair
> > > > > link failures and stuff like that. I'm assuming your userspace only
> > > > > supplies all the properties for crtc and planes, and leaves connectors
> > > > > as-is? Otherwise you already have some fun bugs.
> > > > >
> > > > > In general I'd say userspace shouldn't write stuff it doesn't
> > > > > understand. If you limit yourself to just the properties you do want
> > > > > to (re)set, that's safe. But if you just blindly write everything all
> > > > > the time, random modesets, and hence random failures if you don't set
> > > > > ALLOW_MODESET.
> > > >
> > > > Hi,
> > > >
> > > > how should userspace KMS program A recover from the situation when
> > > > switching the VT back from KMS program B who changed properties that
> > > > program A does not recognise? (I believe Weston does not recover at
> > > > the moment.) This is very important for getting e.g. reliable color
> > > > reproduction, since not all KMS programs are always up-to-date with
> > > > everything the kernel exposes and people may switch between them. Not
> > > > resetting everything may even encourage people to write hacks where you
> > > > temporarily VT-switch away, run a KMS program to set one property, and
> > > > then switch back assuming the property remains set. I have already seen
> > > > someone mention they can enable VRR behind the display server's back
> > > > like this.
> > > >
> > > > I don't think Weston records and re-sets unknown properties yet, but I
> > > > assumed it is what it needs to do to be able to reliably recover from
> > > > VT-switches. In that case ALLOW_MODESET is of course set since all
> > > > state is unknown and assumed bad.
> > > >
> > > > I do believe Weston re-submits *everything* it knows about every
> > > > update, except for CRTCs and connectors it has already disabled and
> > > > knows are in disabled state (this could change though).
> > > >
> > > > However, during steady-state operation when ALLOW_MODESET should not be
> > > > necessary, is it still harmful to re-program *all* properties on every
> > > > update?
> > > >
> > > > After all, the kernel will just no-op all property setting where the
> > > > value is already the right one, does it not?
> > > >
> > > > The only "random" KMS state is the properties the userspace KMS
> > > > program does not know that are set on start-up. I have been assuming
> > > > that as long as you had fbdev active before the KMS program started,
> > > > the unknown properties have "harmless" default values. And maybe even at
> > > > driver device init if fbdev does not exist?
> > > >
> > > > Is there something more up-to-date than
> > > > https://blog.ffwll.ch/2016/01/vt-switching-with-atomic-modeset.html ?
> >
> > Sadly, nothing changed since then.
> >
> > > Thinking more, would the below work?
> >
> > Yup, this would somewhat work. Except not always, I've found one case
> > where even this goes wrong:
> >
> > - Content-protection property, if enabled, has the kernel automatically
> >   switch to enabled if hdcp is actually on and authenticated and all that.
> >   Writing back that enabled value will fail. But there's good chances that
> >   at boot-up content protection isn't enabled, so should work out nicely.
> >
> > - We could fix this by silently downcasting enabled to requested, but
> >   might still lead to surprises since it makes hdcp rather more sticky
> >   than some users might like.
>
> The fix doesn't make HDCP any more or less sticky, it just makes it
> possible to not fail a resetting atomic commit. Without a resetting
> commit, it will remain DESIRED/ENABLED.
>
> If "Content Protection" reads back as DESIRED, userspace that relies on
> read-back for reset will reset it to DESIRED. Could as well be ENABLED
> that the kernel just takes as DESIRED when written.
>
> > > Actor: a KMS userspace program, e.g. a display server
> > >
> > > - On start-up, read all KMS properties and their values. The properties
> > >   that are not recognised are saved in a set called "reset unknowns"
> > >   with their current values.
> > >
> > >   Optional: The program commits the "reset unknown" state to KMS with
> > >   ALLOW_MODESET to ensure it all is writable as is; if that fails,
> > >   there is no guarantee that the program could recover later, so it's
> > >   best to abort in that case. This could be part of the initial
> > >   modeset, too.
> > >
> > > - When the program has lost and regained DRM master status, meaning
> > >   that (unrecognised) KMS state is potentially incorrect, prepare an
> > >   atomic commit with "reset unknowns" set and add all the recognised
> > >   state the program knows of on top. This resets everything to like it
> > >   was, with ALLOW_MODESET.
> > >
> > > - At any other time, do not use the "reset unknowns" set.
> > >
> > > The final point is the important one. I have assumed it would be safe
> > > to use always, but apparently not? Good thing I haven't yet written
> > > code to do that.
> > >
> > > You have to recognise the property to know if it is safe to set
> > > needlessly (for convenience in both code simplicity and ease of
> > > debugging)?
> > >
> > > Also, when using "reset unknowns" set, it actually has to be
> > > partitioned by KMS objects (CRTC, connector, plane...) so if e.g. a
> > > connector no longer exist, you don't attempt to set it.
> > >
> > > However, this still leaves writable properties whose value read is not
> > > legal to write as broken. Let's pray that fbcon or a system compositor
> > > will never succeed in enabling HDCP...
> >
> > Note that the kernel isn't entire consistent on this. I've looked a bit
> > more closely at stuff. Ignoring content protection I've found following
> > approaches things:
> >
> > - self refresh helpers, which are entirely transparent. Therefore we do a
> >   hack to set allow_modeset when the self-refresh helpers need to do a
> >   modeset, to avoid total surprise for userspace. I think this is only ok
> >   for these kind of behind-the-scenes helpers like self-refresh.
> >
> > - link-status is always reset to "good" when you include any connector,
> >   which might force a modeset. Even when allow_modeset isn't set by
> >   userspace. Maybe we should fix that, but we've discussed forever how to
> >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> >   we've gone with this. But maybe limiting to only allow_modeset cases
> >   would also work.
>
> Wait, what do you mean "include any connector"?
>
> What exactly could cause a modeset instead of failure when
> ALLOW_MODESET is not set?

If you do an atomic commit with the connector included that has the
bad link status, then it'll reset it to Good to try to get a full
modeset to restore stuff. If you don't also have ALLOW_MODESET set,
it'll fail and userspace might be sad and not understand what's going
on. We can easily fix this by only forcing the link training to be
fixed if userspace has set ALLOW_MODESET.

> Does that mean that I'll never need to implement link-status handling
> in Weston, because the kernel will recover the link anyway? If the
> kernel does that, then what's the point of having a link-status
> property to begin with?

Well generally all your compositor does all day long is flip buffers.
So you'll never get the kernel into restoring the link. Hence the
uevent, so that the compositor can a) update the mode list, which
might have changed b) do the modeset to restore stuff. The
auto-fallback is so that stuff like users manually
disabling/re-enabling an output actually helps with fixing stuff.

> > - I guess we could do stuff that only fires off when allow_modeset is set,
> >   but I haven't found some examples. I thought we've had some outside of
> >   self-refresh helpers already. The closest I've found is again
> >   link-status, which never allows userspace to set BAD, and silently
> >   upgrades to GOOD. So that userspace doing a blind safe/restore can't
> >   wreak stuff permanently.
>
> Sounds like link-status was designed with a blind save/restore in mind.

Yeah that part we didn't screw up.

> > It's all a bit nasty :-/
> >
> > I think we should at least allow userspace to do a blind restore with
> > allow_modeset and not expect bad side-effects. That would mean fixing at
> > least the content protection stuff.
> >
> > Plus documenting this in the kernel somewhere. As the official thing to
> > do. But maybe we want some actual userspace doing this before we enshrine
> > it as official policy. The content protection fix is a one-liner and can
> > be cc'ed stable.
>
> I'd probably not go there, a blind save does not guarantee a good
> state. The fix to "Content Protection" is not necessary (as long as
> userspace does not do a blind save/restore) if we can get the default
> state from the kernel. If we get the default state from the kernel,
> then userspace would be doing a blind restore but not save, meaning
> that the state actually is sane and writable.
>
> I'd love to volunteer for writing the Weston code to make use of "get me
> sane default state" UAPI, but I'm afraid I'm not in that much control
> of my time.

The problem is, what is your default state? Driver defaults (generally
fairly random and mostly everything is off)? After fbcon has done
it's, which might never happen when you've disabling fbdev/fbcon?
Boot-up state from the firmware for drivers like i915 that support
fastboot (and what if that's garbage)? These can all be different too.

The hard part isn't the uapi, it's the implementation and defining
what you mean. Generally everyone wants their own version of default
value semantics, so this a) encodes policy, which we try to avoid in
the kernel and b) will be epic amounts of endless bikeshedding and
fighting between use-cases.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-21 14:46           ` Pekka Paalanen
@ 2020-04-23 18:21             ` Rajat Jain
  2020-04-24  7:40               ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Rajat Jain @ 2020-04-23 18:21 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Mark Pearson, Benjamin Berg

On Tue, Apr 21, 2020 at 7:46 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 21 Apr 2020 14:37:41 +0200
> Hans de Goede <hdegoede@redhat.com> wrote:
>
> > TL;DR: Yes there will be races, because of both userspace +
> > the firmware having; and potentially using r/w access to
> > the privacy-screen state. But in practice I expect these
> > to not really be an issue. Important here is that userspace
> > only commits the property in a transaction to commit if
> > it actually intends to change the property so as to not
> > needlessly create a situation where we might hit the race.
> >
> > As for 1 vs 2 properties for this I guess that in preparation
> > for potential devices where the state is locked, having a
> > r/w sw-state + a ro hw-state property makes sense.
> >
> > So I suggest that we replace the current "privacy-screen" property
> > from Rajat's patch-set with 2 props named:
> >
> > "privacy-screen-sw-state" (r/w)
> > "privacy-screen-hw-state" (ro)
> >
> > Where for current gen hardware the privacy-screen-hw-state is
> > just a mirror of the sw-state.

Just to make sure I understand the semantics correctly:

- The  "privacy-screen-hw-state" shall be read-only, and can be modified by:
      - Hardware (e.g. HW kill switch).
      - Firmware.
      - (Potentially) needs a notification/irq to the kernel when this
changes (or may be kernel can read it only when userspace queries for
it).

- The "privacy-screen-sw-state" shall be read-write, and can only be
modified by user space.
        - If user space toggles it, the kernel will attempt to
"request" the change to hardware.
        - Whether the request to hardware was successful or not, the
"privacy-screen-sw-state" will always reflect the latest value
userspace wrote.
        - If the request to hardware was successful, the
"privacy-screen-hw-state" will also change (probably via a separate
notification/irq from HW).
        - We expect the user space to write to
"privacy-screen-sw-state" only if it really wants to toggle the value.

What is not clear to me is if any change to"privacy-screen-hw-state"
shall be propagated to "privacy-screen-sw-state"?
 - If yes, then I think we are not solving any problems of single property.
 - If no, then why do we require userspace to write to sw state only
if something has changed?

Also, it seems to me that in my current patchset, the property I have
already behaves like "privacy-screen-sw-state". Do I just need to
rename it?

Thanks,

Rajat

>
> Hi,
>
> this sounds like a good plan to me, assuming the kernel writes only to
> the ro property and never to the r/w property.
>
> I understand that as long as firmware hotkeys will toggle actual state,
> there is no design that could work reliably if userspace will always
> commit all KMS state even when it is not necessary. But not committing
> KMS state unless it is actually necessary is really For implementing the "privacy-screen-sw-state".a new requirement on
> userspace, so that needs to be documented before it's too late.
>
> It's not enough to document "don't set it unless you want to
> overwrite/change it" for privacy screen properties. It needs to be
> documented as a general rule that userspace must follow with *unknown*
> properties as well. "Do not restore unrecognized properties unless the
> kernel KMS state might be incorrect compared to what you used to have."
>
> This means that with a display server that does not understand privacy
> screen properties, the end user will lose the privacy screen state on
> every VT-switch back to the display server.
>
> However, if we had a way to query the kernel for the default state to
> reset unknown properties to, the kernel implementation could return the
> current value of the privacy screen property instead of "off" to not
> lose the firmware state. Assuming firmware hotkeys exist, but if they
> don't then return just "off". The point is that the kernel who knows
> all the properties makes the decision what a sane reset value is.
> Userspace can always override the reset value for the properties
> it recognizes.
>
>
> Thanks,For implementing the "privacy-screen-sw-state".
> pq
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-23 18:21             ` Rajat Jain
@ 2020-04-24  7:40               ` Pekka Paalanen
  2020-04-24  8:24                 ` Hans de Goede
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-24  7:40 UTC (permalink / raw)
  To: Rajat Jain
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Mark Pearson, Benjamin Berg


[-- Attachment #1.1: Type: text/plain, Size: 5138 bytes --]

On Thu, 23 Apr 2020 11:21:47 -0700
Rajat Jain <rajatja@google.com> wrote:

> On Tue, Apr 21, 2020 at 7:46 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 21 Apr 2020 14:37:41 +0200
> > Hans de Goede <hdegoede@redhat.com> wrote:
> >  
> > > TL;DR: Yes there will be races, because of both userspace +
> > > the firmware having; and potentially using r/w access to
> > > the privacy-screen state. But in practice I expect these
> > > to not really be an issue. Important here is that userspace
> > > only commits the property in a transaction to commit if
> > > it actually intends to change the property so as to not
> > > needlessly create a situation where we might hit the race.
> > >
> > > As for 1 vs 2 properties for this I guess that in preparation
> > > for potential devices where the state is locked, having a
> > > r/w sw-state + a ro hw-state property makes sense.
> > >
> > > So I suggest that we replace the current "privacy-screen" property
> > > from Rajat's patch-set with 2 props named:
> > >
> > > "privacy-screen-sw-state" (r/w)
> > > "privacy-screen-hw-state" (ro)
> > >
> > > Where for current gen hardware the privacy-screen-hw-state is
> > > just a mirror of the sw-state.  
> 
> Just to make sure I understand the semantics correctly:
> 
> - The  "privacy-screen-hw-state" shall be read-only, and can be modified by:
>       - Hardware (e.g. HW kill switch).
>       - Firmware.
>       - (Potentially) needs a notification/irq to the kernel when this
> changes (or may be kernel can read it only when userspace queries for
> it).
> 
> - The "privacy-screen-sw-state" shall be read-write, and can only be
> modified by user space.
>         - If user space toggles it, the kernel will attempt to
> "request" the change to hardware.
>         - Whether the request to hardware was successful or not, the
> "privacy-screen-sw-state" will always reflect the latest value
> userspace wrote.
>         - If the request to hardware was successful, the
> "privacy-screen-hw-state" will also change (probably via a separate
> notification/irq from HW).
>         - We expect the user space to write to
> "privacy-screen-sw-state" only if it really wants to toggle the value.

Hi,

yes, to my understanding, that seems to be the correct idea from this
thread. The hw-state property must reflect the actual hardware state at
all times.

However, when userspace sets "privacy-screen-sw-state", the driver
should attempt to change hardware state regardless of whether the
"privacy-screen-sw-state" value changes compared to its old value or
not. Otherwise userspace cannot intentionally override a hardware
hotkey setting if possible (or would need two atomic commits to do it).

Mind, the above paragraph is only what I interpreted from this email
thread here. Previously I did not think that with atomic KMS, setting a
property to a value it already has could trigger anything. But I guess
it can?

This design is based on that it can.

> What is not clear to me is if any change to"privacy-screen-hw-state"
> shall be propagated to "privacy-screen-sw-state"?
>  - If yes, then I think we are not solving any problems of single property.
>  - If no, then why do we require userspace to write to sw state only
> if something has changed?

No. As already written, the kernel must not change the value of
"privacy-screen-sw-state", only userspace can.

Let's assume that you have a firmware-implemented hardware hotkey for
toggling the shield. The driver also successfully implements
"privacy-screen-sw-state" meaning that writing to it will set the
hardware shield state. If userspace was writing
"privacy-screen-sw-state" even when it does not intend to change
hardware state, it would almost immediately override any state set by
the hardware hotkey, making the hardware hotkey (randomly) not work.

This assumes that the hardware hotkey is a momentary switch that does
not stop software from controlling the shield too.

If the hardware hotkey can stop software from changing the shield
state, then it might not be necessary for userspace to avoid unneeded
setting of the property. But that depends on which way the hotkey works
and which way users want to use it, so it's still best for userspace to
not set the property unless it really intends to apply a change.

If possible, it would be good to make this case the prime example of
how to correctly implement KMS properties for a hardware feature that
can be controlled (and fought over) by both userspace and
hardware/firmware. It seems like the same design can also work with
hardware switches that force the hardware state to be one or the other,
stopping userspace from changing it. Therefore I'd avoid incorporating
any specific shield use cases in the design, e.g. "if hw switch is set
to shield-on, userspace cannot turn shield off".


Thanks,
pq


> 
> Also, it seems to me that in my current patchset, the property I have
> already behaves like "privacy-screen-sw-state". Do I just need to
> rename it?
> 
> Thanks,
> 
> Rajat

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-24  7:40               ` Pekka Paalanen
@ 2020-04-24  8:24                 ` Hans de Goede
  2020-04-24  9:08                   ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Hans de Goede @ 2020-04-24  8:24 UTC (permalink / raw)
  To: Pekka Paalanen, Rajat Jain
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Mark Pearson, Benjamin Berg

Hi all,

Pekka, Rajat,

Thank you for your input in this.

On 4/24/20 9:40 AM, Pekka Paalanen wrote:
> On Thu, 23 Apr 2020 11:21:47 -0700
> Rajat Jain <rajatja@google.com> wrote:
> 
>> On Tue, Apr 21, 2020 at 7:46 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>>>
>>> On Tue, 21 Apr 2020 14:37:41 +0200
>>> Hans de Goede <hdegoede@redhat.com> wrote:
>>>   
>>>> TL;DR: Yes there will be races, because of both userspace +
>>>> the firmware having; and potentially using r/w access to
>>>> the privacy-screen state. But in practice I expect these
>>>> to not really be an issue. Important here is that userspace
>>>> only commits the property in a transaction to commit if
>>>> it actually intends to change the property so as to not
>>>> needlessly create a situation where we might hit the race.
>>>>
>>>> As for 1 vs 2 properties for this I guess that in preparation
>>>> for potential devices where the state is locked, having a
>>>> r/w sw-state + a ro hw-state property makes sense.
>>>>
>>>> So I suggest that we replace the current "privacy-screen" property
>>>> from Rajat's patch-set with 2 props named:
>>>>
>>>> "privacy-screen-sw-state" (r/w)
>>>> "privacy-screen-hw-state" (ro)
>>>>
>>>> Where for current gen hardware the privacy-screen-hw-state is
>>>> just a mirror of the sw-state.
>>
>> Just to make sure I understand the semantics correctly:
>>
>> - The  "privacy-screen-hw-state" shall be read-only, and can be modified by:
>>        - Hardware (e.g. HW kill switch).
>>        - Firmware.
>>        - (Potentially) needs a notification/irq to the kernel when this
>> changes (or may be kernel can read it only when userspace queries for
>> it).
>>
>> - The "privacy-screen-sw-state" shall be read-write, and can only be
>> modified by user space.
>>          - If user space toggles it, the kernel will attempt to
>> "request" the change to hardware.
>>          - Whether the request to hardware was successful or not, the
>> "privacy-screen-sw-state" will always reflect the latest value
>> userspace wrote.
>>          - If the request to hardware was successful, the
>> "privacy-screen-hw-state" will also change (probably via a separate
>> notification/irq from HW).
>>          - We expect the user space to write to
>> "privacy-screen-sw-state" only if it really wants to toggle the value.
> 
> Hi,
> 
> yes, to my understanding, that seems to be the correct idea from this
> thread. The hw-state property must reflect the actual hardware state at
> all times.

Agree on the hw-state prop reflecting the actual hardware state at
all times, that one is easy.

> However, when userspace sets "privacy-screen-sw-state", the driver
> should attempt to change hardware state regardless of whether the
> "privacy-screen-sw-state" value changes compared to its old value or
> not. Otherwise userspace cannot intentionally override a hardware
> hotkey setting if possible (or would need two atomic commits to do it).

Ack / agreed.

> Mind, the above paragraph is only what I interpreted from this email
> thread here. Previously I did not think that with atomic KMS, setting a
> property to a value it already has could trigger anything. But I guess
> it can?

In a way. My idea for the "privacy-screen-sw-state" is for it to reflect
the last requested value, where the request could come from either a
firmware controlled hotkey; or from userspace (this seems to be where
our ideas of how to handle this diverts).

So what can happen is (with both props being always in sync)
-userspace reads privacy screen being off
-user toggles privacy screen on through firmware controlled hotkey
-kernel gets notified about state toggle, updates both property
  states to on
-userspace commits its old knowledge of the property (off), thereby
  triggering the kernel to turn the privacy screen back off

So in this case from the kernel pov the property is actually set
to a new value, not to "a value it already has".

Note there can be races here of course, but lets ignore those (for now).
Both the hotkey event as well as userspace changing the setting will be
end-user triggered events and will not happen at high frequency.
Also I see no way to completely eliminate racing here. Luckily the side
effects of the race or pretty harmless (annoying maybe, but not causing
crashes, etc).

> This design is based on that it can.
> 
>> What is not clear to me is if any change to"privacy-screen-hw-state"
>> shall be propagated to "privacy-screen-sw-state"?
>>   - If yes, then I think we are not solving any problems of single property.
>>   - If no, then why do we require userspace to write to sw state only
>> if something has changed?
>
> No. As already written, the kernel must not change the value of
> "privacy-screen-sw-state", only userspace can.

So this is where out view of how to handle this differs, I do
not see the hotkey changing the state as different from userspace
changing it. The reason for me to have both a sw- and a hw-state
is in case there is a physical switch (typically a slider style
switch) which forces the state to on / off. In this case userspace
could still set the "privacy-screen-sw-state" prop and then
the 2 could differ.

Lets add one more complication to this, which I think helps.
Currently the thinkpad_acpi driver exports the privacy screen as:

/proc/acpi/ibm/lcdshadow

Userspace can write this and then change the privacy-screen
setting, this is in shipped kernels and cannot be dropped
because it is part if the kernel's uABI now.
This means that another userspace process can change the
property underneath a kms client. I do not see how this is
different from the firmware changing the setting based on
a hotkey press. Yet if we stick with your "only userspace can"
change the sw-state setting, then does this count as userspace,
or do you mean only a kms client can ?  And then how is
another kms-client changing the setting different ?

So to me to avoid confusion the only valid case where the
hw- and sw-state can differ is if userspace requests
say "off" as state while the privacy screen is forced on
by say a physical switch (or e.g. a BIOS option to lock it?).

Then we would remember the off in sw-state but hw-state would
still be on.

I guess that maybe for the enum of the hw-state we need 4 values
instead of 2:

Enabled
Disabled
Enabled, locked
Disabled, locked

To indicate to userspace that atm the state cannot be changed.

If userspace then still changes sw-state we cache it and apply
this if the privacy screen control gets unlocked.

On hardware where there is no "lock" the 2 properties will simply
always be the same.

> Let's assume that you have a firmware-implemented hardware hotkey for
> toggling the shield. The driver also successfully implements
> "privacy-screen-sw-state" meaning that writing to it will set the
> hardware shield state. If userspace was writing
> "privacy-screen-sw-state" even when it does not intend to change
> hardware state, it would almost immediately override any state set by
> the hardware hotkey, making the hardware hotkey (randomly) not work.

Right, this is why userspace should not set the property unless
it really means to change it, even then things could still race,
but as explained above that should normally never happen and luckily
the side-effects of hitting the race somehow are not that bad.

> This assumes that the hardware hotkey is a momentary switch that does
> not stop software from controlling the shield too.

This is correct for the Lenovo / thinkpad_acpi case.

> If the hardware hotkey can stop software from changing the shield
> state, then it might not be necessary for userspace to avoid unneeded
> setting of the property. But that depends on which way the hotkey works
> and which way users want to use it, so it's still best for userspace to
> not set the property unless it really intends to apply a change.

Ack.

> If possible, it would be good to make this case the prime example of
> how to correctly implement KMS properties for a hardware feature that
> can be controlled (and fought over) by both userspace and
> hardware/firmware. It seems like the same design can also work with
> hardware switches that force the hardware state to be one or the other,
> stopping userspace from changing it. Therefore I'd avoid incorporating
> any specific shield use cases in the design, e.g. "if hw switch is set
> to shield-on, userspace cannot turn shield off".

I agree that it would be good to make this the prime example of
how to deal with similar cases.

>> Also, it seems to me that in my current patchset, the property I have
>> already behaves like "privacy-screen-sw-state". Do I just need to
>> rename it?

Maybe, it looks like we first need to figure out the exact semantics
of all this.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-23 15:01                 ` Daniel Vetter
@ 2020-04-24  8:32                   ` Pekka Paalanen
  2020-04-28 14:51                     ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 9254 bytes --]

On Thu, 23 Apr 2020 17:01:49 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 21 Apr 2020 14:15:52 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:
> >  

...

> > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > more closely at stuff. Ignoring content protection I've found following
> > > approaches things:
> > >
> > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > >   for these kind of behind-the-scenes helpers like self-refresh.
> > >
> > > - link-status is always reset to "good" when you include any connector,
> > >   which might force a modeset. Even when allow_modeset isn't set by
> > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > >   would also work.  
> >
> > Wait, what do you mean "include any connector"?
> >
> > What exactly could cause a modeset instead of failure when
> > ALLOW_MODESET is not set?  
> 
> If you do an atomic commit with the connector included that has the
> bad link status, then it'll reset it to Good to try to get a full
> modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> it'll fail and userspace might be sad and not understand what's going
> on. We can easily fix this by only forcing the link training to be
> fixed if userspace has set ALLOW_MODESET.

Hi,

oh, that's ok, the fail case is there like it should.

It sounded like even if userspace does not set ALLOW_MODESET, the
kernel would still do a modeset in come cases. I'm happy to have
misunderstood.

> > Does that mean that I'll never need to implement link-status handling
> > in Weston, because the kernel will recover the link anyway? If the
> > kernel does that, then what's the point of having a link-status
> > property to begin with?  
> 
> Well generally all your compositor does all day long is flip buffers.
> So you'll never get the kernel into restoring the link. Hence the
> uevent, so that the compositor can a) update the mode list, which
> might have changed b) do the modeset to restore stuff. The
> auto-fallback is so that stuff like users manually
> disabling/re-enabling an output actually helps with fixing stuff.

Ok, that's fine.

If link goes bad while Weston does not implement link-status, I'm
totally happy to degree any fallout from that to be a Weston bug. It
has never worked (right?), so it cannot be a kernel regression. It is
quite important to me to be able to say to that a failure is a Weston
bug, not a kernel regression, to not piss on the kernel devs.

> > > - I guess we could do stuff that only fires off when allow_modeset is set,
> > >   but I haven't found some examples. I thought we've had some outside of
> > >   self-refresh helpers already. The closest I've found is again
> > >   link-status, which never allows userspace to set BAD, and silently
> > >   upgrades to GOOD. So that userspace doing a blind safe/restore can't
> > >   wreak stuff permanently.  
> >
> > Sounds like link-status was designed with a blind save/restore in mind.  
> 
> Yeah that part we didn't screw up.
> 
> > > It's all a bit nasty :-/
> > >
> > > I think we should at least allow userspace to do a blind restore with
> > > allow_modeset and not expect bad side-effects. That would mean fixing at
> > > least the content protection stuff.
> > >
> > > Plus documenting this in the kernel somewhere. As the official thing to
> > > do. But maybe we want some actual userspace doing this before we enshrine
> > > it as official policy. The content protection fix is a one-liner and can
> > > be cc'ed stable.  
> >
> > I'd probably not go there, a blind save does not guarantee a good
> > state. The fix to "Content Protection" is not necessary (as long as
> > userspace does not do a blind save/restore) if we can get the default
> > state from the kernel. If we get the default state from the kernel,
> > then userspace would be doing a blind restore but not save, meaning
> > that the state actually is sane and writable.
> >
> > I'd love to volunteer for writing the Weston code to make use of "get me
> > sane default state" UAPI, but I'm afraid I'm not in that much control
> > of my time.  
> 
> The problem is, what is your default state? Driver defaults (generally
> fairly random and mostly everything is off)? After fbcon has done
> it's, which might never happen when you've disabling fbdev/fbcon?
> Boot-up state from the firmware for drivers like i915 that support
> fastboot (and what if that's garbage)? These can all be different too.

I believe what I want is more or less the driver defaults, or DRM core
defaults for common KMS properties. It does not matter if they are
arbitrary, as long as they remain the same across kernel versions. They
need to be constants, so that I can rely on always getting to the same
state on the same piece of hardware when I use the "default state"
regardless of what might be happening software wise.

But of course the defaults must be legal values and they should make
some sense. I'd consider "neutral", "identity", and "off" to be
sensible values, but what that means for each property depends on the
exact property. So there may be work to be done to unify the default
values across drivers for driver-specific properties.

Having stuff off by default is ok. If that makes things not work, then
we'll fix userspace to understand the new properties that are
necessary. At no point it can be said to be a kernel regression - at
least I very much hope and intend so.

There is always the problem of someone using new userspace first on old
kernel that does not expose "default state" so userspace doesn't use
"default state". Then upgrading to new kernel that has "default state"
that is not good for making things work, but userspace starts using it
because it becomes available and fails. Is there any way we can stop
that from being seen as a kernel regression?

Fbcon has exactly the problem that fbcon might not exist, so we cannot
rely on that. Is fbcon even reliable enough to offer constant defaults
that could be relied upon to stay the same across kernel versions and
software configurations?

State from firmware I would not trust at all, who knows when a firmware
upgrade changes things.

What I'm saying is, the "default state" would very much be UABI. Not
just how to access it, but the exact values of it for all existing
properties. The important invariant is: for the same piece of hardware,
I always get the same defaults (if any), regardless of any software or
firmware versions.

This invariant should guarantee that e.g. measured monitor color
profiles stay valid across kernel upgrades. I know, people who actually
care about color do not trust even reboots, but if we can let them get
away with just verifying the profile instead of wasting hours in
re-measuring the profile, that would be good.

When a driver starts supporting a new property, it must set the default
value of the property to the value it has been using implicitly before.
But if that implicit value was not hardcoded and depends on variable
factors, then we probably cannot avoid a change in behaviour when
introducing the property or "default values".

> The hard part isn't the uapi, it's the implementation and defining
> what you mean. Generally everyone wants their own version of default
> value semantics, so this a) encodes policy, which we try to avoid in
> the kernel and b) will be epic amounts of endless bikeshedding and
> fighting between use-cases.

It's a policy, yes. But we need a single source of arbitrary default
values, so that we have a strategy for achieving the same hardware
state even when userspace does not understand all KMS properties:

- after losing and re-gaining DRM master status (e.g. temporary
  VT-switch to another KMS program that modifies state I don't
  understand), and

- across changes in the software stack.

The former is the most important point, the latter would be good to
have.

We already have a policy in the kernel: fbcon KMS state. If everyone
can agree to that state, it's fine by me, but it also needs to be
available when fbcon support is disabled. Maybe this could be a path
forward? Expose what state fbcon would use, regardless of whether fbcon
exists. It would probably miss the latter point, though, but that could
be acceptable.

If we could trust that the KMS state is "sane" when any KMS program
starts, then this problem doesn't exist: just read out the KMS state on
start-up and use that. But since we don't know where the existing KMS
state comes from, it could be anything. And for smooth hand-off between
display servers, we can't have per DRM file description KMS state
either.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-24  8:24                 ` Hans de Goede
@ 2020-04-24  9:08                   ` Pekka Paalanen
  2020-04-24 10:32                     ` Hans de Goede
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-24  9:08 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg


[-- Attachment #1.1: Type: text/plain, Size: 11552 bytes --]

On Fri, 24 Apr 2020 10:24:31 +0200
Hans de Goede <hdegoede@redhat.com> wrote:

> Hi all,
> 
> Pekka, Rajat,
> 
> Thank you for your input in this.
> 
> On 4/24/20 9:40 AM, Pekka Paalanen wrote:
> > On Thu, 23 Apr 2020 11:21:47 -0700
> > Rajat Jain <rajatja@google.com> wrote:
> >   
> >> On Tue, Apr 21, 2020 at 7:46 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> >>>
> >>> On Tue, 21 Apr 2020 14:37:41 +0200
> >>> Hans de Goede <hdegoede@redhat.com> wrote:
> >>>     
> >>>> TL;DR: Yes there will be races, because of both userspace +
> >>>> the firmware having; and potentially using r/w access to
> >>>> the privacy-screen state. But in practice I expect these
> >>>> to not really be an issue. Important here is that userspace
> >>>> only commits the property in a transaction to commit if
> >>>> it actually intends to change the property so as to not
> >>>> needlessly create a situation where we might hit the race.
> >>>>
> >>>> As for 1 vs 2 properties for this I guess that in preparation
> >>>> for potential devices where the state is locked, having a
> >>>> r/w sw-state + a ro hw-state property makes sense.
> >>>>
> >>>> So I suggest that we replace the current "privacy-screen" property
> >>>> from Rajat's patch-set with 2 props named:
> >>>>
> >>>> "privacy-screen-sw-state" (r/w)
> >>>> "privacy-screen-hw-state" (ro)
> >>>>
> >>>> Where for current gen hardware the privacy-screen-hw-state is
> >>>> just a mirror of the sw-state.  
> >>
> >> Just to make sure I understand the semantics correctly:
> >>
> >> - The  "privacy-screen-hw-state" shall be read-only, and can be modified by:
> >>        - Hardware (e.g. HW kill switch).
> >>        - Firmware.
> >>        - (Potentially) needs a notification/irq to the kernel when this
> >> changes (or may be kernel can read it only when userspace queries for
> >> it).
> >>
> >> - The "privacy-screen-sw-state" shall be read-write, and can only be
> >> modified by user space.
> >>          - If user space toggles it, the kernel will attempt to
> >> "request" the change to hardware.
> >>          - Whether the request to hardware was successful or not, the
> >> "privacy-screen-sw-state" will always reflect the latest value
> >> userspace wrote.
> >>          - If the request to hardware was successful, the
> >> "privacy-screen-hw-state" will also change (probably via a separate
> >> notification/irq from HW).
> >>          - We expect the user space to write to
> >> "privacy-screen-sw-state" only if it really wants to toggle the value.  
> > 
> > Hi,
> > 
> > yes, to my understanding, that seems to be the correct idea from this
> > thread. The hw-state property must reflect the actual hardware state at
> > all times.  
> 
> Agree on the hw-state prop reflecting the actual hardware state at
> all times, that one is easy.
> 
> > However, when userspace sets "privacy-screen-sw-state", the driver
> > should attempt to change hardware state regardless of whether the
> > "privacy-screen-sw-state" value changes compared to its old value or
> > not. Otherwise userspace cannot intentionally override a hardware
> > hotkey setting if possible (or would need two atomic commits to do it).  
> 
> Ack / agreed.
> 
> > Mind, the above paragraph is only what I interpreted from this email
> > thread here. Previously I did not think that with atomic KMS, setting a
> > property to a value it already has could trigger anything. But I guess
> > it can?  
> 
> In a way. My idea for the "privacy-screen-sw-state" is for it to reflect
> the last requested value, where the request could come from either a
> firmware controlled hotkey; or from userspace (this seems to be where
> our ideas of how to handle this diverts).
> 
> So what can happen is (with both props being always in sync)
> -userspace reads privacy screen being off
> -user toggles privacy screen on through firmware controlled hotkey
> -kernel gets notified about state toggle, updates both property
>   states to on
> -userspace commits its old knowledge of the property (off), thereby
>   triggering the kernel to turn the privacy screen back off
> 
> So in this case from the kernel pov the property is actually set
> to a new value, not to "a value it already has".

Hi,

that is an interesting point of view.

You are keeping the separation between "wanted" and "actual" state, but
counting firmware/hardware hotkeys as "want" instead of letting them
silently change hardware state.

That seems ok.

> Note there can be races here of course, but lets ignore those (for now).
> Both the hotkey event as well as userspace changing the setting will be
> end-user triggered events and will not happen at high frequency.
> Also I see no way to completely eliminate racing here. Luckily the side
> effects of the race or pretty harmless (annoying maybe, but not causing
> crashes, etc).
> 
> > This design is based on that it can.
> >   
> >> What is not clear to me is if any change to"privacy-screen-hw-state"
> >> shall be propagated to "privacy-screen-sw-state"?
> >>   - If yes, then I think we are not solving any problems of single property.
> >>   - If no, then why do we require userspace to write to sw state only
> >> if something has changed?  
> >
> > No. As already written, the kernel must not change the value of
> > "privacy-screen-sw-state", only userspace can.  
> 
> So this is where out view of how to handle this differs, I do
> not see the hotkey changing the state as different from userspace
> changing it. The reason for me to have both a sw- and a hw-state
> is in case there is a physical switch (typically a slider style
> switch) which forces the state to on / off. In this case userspace
> could still set the "privacy-screen-sw-state" prop and then
> the 2 could differ.

Yes, the locked switch case definitely makes sense to me.

If userspace has to avoid setting the sw property unless it actually
intends to change it, then the sw property being controlled from
multiple sources (firmware, hotkey, the /proc file below) could work.
It would even tell the KMS client when someone else changed the
"wanted" state.

> Lets add one more complication to this, which I think helps.
> Currently the thinkpad_acpi driver exports the privacy screen as:
> 
> /proc/acpi/ibm/lcdshadow
> 
> Userspace can write this and then change the privacy-screen
> setting, this is in shipped kernels and cannot be dropped
> because it is part if the kernel's uABI now.
> This means that another userspace process can change the
> property underneath a kms client. I do not see how this is
> different from the firmware changing the setting based on
> a hotkey press. Yet if we stick with your "only userspace can"
> change the sw-state setting, then does this count as userspace,
> or do you mean only a kms client can ?  And then how is
> another kms-client changing the setting different ?

To me that would be similar to firmware changing hardware state: it's
not the KMS client (the display server) doing it, but something else
behind its back while it thinks it's in full control.

Doing things behind the display server's back is what creates all the
mess here.

Another KMS client cannot set the property behind the display server's
back, because the display server is holding DRM master, and the
property cannot be written without DRM master status. If the display
server drops DRM master, it knows it probably lost all state.

> So to me to avoid confusion the only valid case where the
> hw- and sw-state can differ is if userspace requests
> say "off" as state while the privacy screen is forced on
> by say a physical switch (or e.g. a BIOS option to lock it?).
> 
> Then we would remember the off in sw-state but hw-state would
> still be on.
> 
> I guess that maybe for the enum of the hw-state we need 4 values
> instead of 2:
> 
> Enabled
> Disabled
> Enabled, locked
> Disabled, locked
> 
> To indicate to userspace that atm the state cannot be changed.

If userspace needs that information, sure.

This makes me think that a driver needs to handle different types of
switches/hotkeys through different properties:

- For a hardware latching switch that forces the shield state to be one
  and not the other, it should change the hw-state but it does not seem
  to have a reason to change the sw-state property: it's not a "want",
  it's a hard setting. Changing sw-state would also lose the userspace
  preference of the setting.

- For momentary button or a hotkey that is supposed to just toggle the
  shield state, it would toggle sw-state property since it's a "want"
  that can be overridden. Setting the property leads to changing the
  hw-state as well (if not locked).

Does that make sense?

Maybe this is the best compromise given the display server cannot be in
full control.

> If userspace then still changes sw-state we cache it and apply
> this if the privacy screen control gets unlocked.

Sounds good.

> On hardware where there is no "lock" the 2 properties will simply
> always be the same.

Ok.


Thanks,
pq

> > Let's assume that you have a firmware-implemented hardware hotkey for
> > toggling the shield. The driver also successfully implements
> > "privacy-screen-sw-state" meaning that writing to it will set the
> > hardware shield state. If userspace was writing
> > "privacy-screen-sw-state" even when it does not intend to change
> > hardware state, it would almost immediately override any state set by
> > the hardware hotkey, making the hardware hotkey (randomly) not work.  
> 
> Right, this is why userspace should not set the property unless
> it really means to change it, even then things could still race,
> but as explained above that should normally never happen and luckily
> the side-effects of hitting the race somehow are not that bad.
> 
> > This assumes that the hardware hotkey is a momentary switch that does
> > not stop software from controlling the shield too.  
> 
> This is correct for the Lenovo / thinkpad_acpi case.
> 
> > If the hardware hotkey can stop software from changing the shield
> > state, then it might not be necessary for userspace to avoid unneeded
> > setting of the property. But that depends on which way the hotkey works
> > and which way users want to use it, so it's still best for userspace to
> > not set the property unless it really intends to apply a change.  
> 
> Ack.
> 
> > If possible, it would be good to make this case the prime example of
> > how to correctly implement KMS properties for a hardware feature that
> > can be controlled (and fought over) by both userspace and
> > hardware/firmware. It seems like the same design can also work with
> > hardware switches that force the hardware state to be one or the other,
> > stopping userspace from changing it. Therefore I'd avoid incorporating
> > any specific shield use cases in the design, e.g. "if hw switch is set
> > to shield-on, userspace cannot turn shield off".  
> 
> I agree that it would be good to make this the prime example of
> how to deal with similar cases.
> 
> >> Also, it seems to me that in my current patchset, the property I have
> >> already behaves like "privacy-screen-sw-state". Do I just need to
> >> rename it?  
> 
> Maybe, it looks like we first need to figure out the exact semantics
> of all this.
> 
> Regards,
> 
> Hans
> 


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: RFC: Drm-connector properties managed by another driver / privacy screen support
  2020-04-24  9:08                   ` Pekka Paalanen
@ 2020-04-24 10:32                     ` Hans de Goede
  0 siblings, 0 replies; 54+ messages in thread
From: Hans de Goede @ 2020-04-24 10:32 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Thomas Zimmermann, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

Hi all,

On 4/24/20 11:08 AM, Pekka Paalanen wrote:
> On Fri, 24 Apr 2020 10:24:31 +0200
> Hans de Goede <hdegoede@redhat.com> wrote:

<snip>

>> Agree on the hw-state prop reflecting the actual hardware state at
>> all times, that one is easy.
>>
>>> However, when userspace sets "privacy-screen-sw-state", the driver
>>> should attempt to change hardware state regardless of whether the
>>> "privacy-screen-sw-state" value changes compared to its old value or
>>> not. Otherwise userspace cannot intentionally override a hardware
>>> hotkey setting if possible (or would need two atomic commits to do it).
>>
>> Ack / agreed.
>>
>>> Mind, the above paragraph is only what I interpreted from this email
>>> thread here. Previously I did not think that with atomic KMS, setting a
>>> property to a value it already has could trigger anything. But I guess
>>> it can?
>>
>> In a way. My idea for the "privacy-screen-sw-state" is for it to reflect
>> the last requested value, where the request could come from either a
>> firmware controlled hotkey; or from userspace (this seems to be where
>> our ideas of how to handle this diverts).
>>
>> So what can happen is (with both props being always in sync)
>> -userspace reads privacy screen being off
>> -user toggles privacy screen on through firmware controlled hotkey
>> -kernel gets notified about state toggle, updates both property
>>    states to on
>> -userspace commits its old knowledge of the property (off), thereby
>>    triggering the kernel to turn the privacy screen back off
>>
>> So in this case from the kernel pov the property is actually set
>> to a new value, not to "a value it already has".
> 
> Hi,
> 
> that is an interesting point of view.
> 
> You are keeping the separation between "wanted" and "actual" state, but
> counting firmware/hardware hotkeys as "want" instead of letting them
> silently change hardware state.

Right, that seems more natural to me, as mentioned
already, this way the wanted and hw state only get out
of sync if the hw is locked to a certain state.

> That seems ok.
> 
>> Note there can be races here of course, but lets ignore those (for now).
>> Both the hotkey event as well as userspace changing the setting will be
>> end-user triggered events and will not happen at high frequency.
>> Also I see no way to completely eliminate racing here. Luckily the side
>> effects of the race or pretty harmless (annoying maybe, but not causing
>> crashes, etc).
>>
>>> This design is based on that it can.
>>>    
>>>> What is not clear to me is if any change to"privacy-screen-hw-state"
>>>> shall be propagated to "privacy-screen-sw-state"?
>>>>    - If yes, then I think we are not solving any problems of single property.
>>>>    - If no, then why do we require userspace to write to sw state only
>>>> if something has changed?
>>>
>>> No. As already written, the kernel must not change the value of
>>> "privacy-screen-sw-state", only userspace can.
>>
>> So this is where out view of how to handle this differs, I do
>> not see the hotkey changing the state as different from userspace
>> changing it. The reason for me to have both a sw- and a hw-state
>> is in case there is a physical switch (typically a slider style
>> switch) which forces the state to on / off. In this case userspace
>> could still set the "privacy-screen-sw-state" prop and then
>> the 2 could differ.
> 
> Yes, the locked switch case definitely makes sense to me.
> 
> If userspace has to avoid setting the sw property unless it actually
> intends to change it, then the sw property being controlled from
> multiple sources (firmware, hotkey, the /proc file below) could work.
> It would even tell the KMS client when someone else changed the
> "wanted" state.

Right, that is the idea and telling the KMS client definitely
is a feature we want, so that we can show an OSD notifcation
on the firmware handled hotkey presses, like we already do for volume
control/mute, (kbd) backlight changes, etc.

>> Lets add one more complication to this, which I think helps.
>> Currently the thinkpad_acpi driver exports the privacy screen as:
>>
>> /proc/acpi/ibm/lcdshadow
>>
>> Userspace can write this and then change the privacy-screen
>> setting, this is in shipped kernels and cannot be dropped
>> because it is part if the kernel's uABI now.
>> This means that another userspace process can change the
>> property underneath a kms client. I do not see how this is
>> different from the firmware changing the setting based on
>> a hotkey press. Yet if we stick with your "only userspace can"
>> change the sw-state setting, then does this count as userspace,
>> or do you mean only a kms client can ?  And then how is
>> another kms-client changing the setting different ?
> 
> To me that would be similar to firmware changing hardware state: it's
> not the KMS client (the display server) doing it, but something else
> behind its back while it thinks it's in full control.
> 
> Doing things behind the display server's back is what creates all the
> mess here.

Right, unfortunately this is not something which we can change.
I have asked Lenovo if it will be possible to just have the hotkey
send keypresses and let userspace deal with it. This might happen
on future hardware generations. But then again it might not and
for the current hardware gen. we are stuck with the firmware
handling it.

> Another KMS client cannot set the property behind the display server's
> back, because the display server is holding DRM master, and the
> property cannot be written without DRM master status. If the display
> server drops DRM master, it knows it probably lost all state.

True.

>> So to me to avoid confusion the only valid case where the
>> hw- and sw-state can differ is if userspace requests
>> say "off" as state while the privacy screen is forced on
>> by say a physical switch (or e.g. a BIOS option to lock it?).
>>
>> Then we would remember the off in sw-state but hw-state would
>> still be on.
>>
>> I guess that maybe for the enum of the hw-state we need 4 values
>> instead of 2:
>>
>> Enabled
>> Disabled
>> Enabled, locked
>> Disabled, locked
>>
>> To indicate to userspace that atm the state cannot be changed.
> 
> If userspace needs that information, sure.

Yes I believe that userspace will want to know when the state is locked.
Currently when wifi is disable through a hardware switch, like
the sliders on the side of (usually somewhat older) laptops, then e.g.
GNOME will show "Disabled in hardware" (the wording could do with
some work) at the place where you would normally run wifi on/off and
select the network to use.

> This makes me think that a driver needs to handle different types of
> switches/hotkeys through different properties:
> 
> - For a hardware latching switch that forces the shield state to be one
>    and not the other, it should change the hw-state but it does not seem
>    to have a reason to change the sw-state property: it's not a "want",
>    it's a hard setting. Changing sw-state would also lose the userspace
>    preference of the setting.

Yes, I completely agree.

> - For momentary button or a hotkey that is supposed to just toggle the
>    shield state, it would toggle sw-state property since it's a "want"
>    that can be overridden. Setting the property leads to changing the
>    hw-state as well (if not locked).

Also agreed, although in practice the firmware may actually change
the hw state itself (if not locked), but we would then update both the sw
and hw props to reflect this.

> Does that make sense?

Yes this makes sense to me. Now we just need to write it all down in
such a way so that other people will also understand it :)

> Maybe this is the best compromise given the display server cannot be in
> full control.

I agree that this seems to be the best compromise.

>> If userspace then still changes sw-state we cache it and apply
>> this if the privacy screen control gets unlocked.
> 
> Sounds good.
> 
>> On hardware where there is no "lock" the 2 properties will simply
>> always be the same.
> 
> Ok.

Regards,

Hans

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-24  8:32                   ` Pekka Paalanen
@ 2020-04-28 14:51                     ` Daniel Vetter
  2020-04-29 10:07                       ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-28 14:51 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:
> On Thu, 23 Apr 2020 17:01:49 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > >
> > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > >  
> 
> ...
> 
> > > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > > more closely at stuff. Ignoring content protection I've found following
> > > > approaches things:
> > > >
> > > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > > >   for these kind of behind-the-scenes helpers like self-refresh.
> > > >
> > > > - link-status is always reset to "good" when you include any connector,
> > > >   which might force a modeset. Even when allow_modeset isn't set by
> > > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > > >   would also work.  
> > >
> > > Wait, what do you mean "include any connector"?
> > >
> > > What exactly could cause a modeset instead of failure when
> > > ALLOW_MODESET is not set?  
> > 
> > If you do an atomic commit with the connector included that has the
> > bad link status, then it'll reset it to Good to try to get a full
> > modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> > it'll fail and userspace might be sad and not understand what's going
> > on. We can easily fix this by only forcing the link training to be
> > fixed if userspace has set ALLOW_MODESET.
> 
> Hi,
> 
> oh, that's ok, the fail case is there like it should.
> 
> It sounded like even if userspace does not set ALLOW_MODESET, the
> kernel would still do a modeset in come cases. I'm happy to have
> misunderstood.

Well currently that can go wrong, if you include a connector and a
link-status is bad. But could be fixed fairly easily.

> > > Does that mean that I'll never need to implement link-status handling
> > > in Weston, because the kernel will recover the link anyway? If the
> > > kernel does that, then what's the point of having a link-status
> > > property to begin with?  
> > 
> > Well generally all your compositor does all day long is flip buffers.
> > So you'll never get the kernel into restoring the link. Hence the
> > uevent, so that the compositor can a) update the mode list, which
> > might have changed b) do the modeset to restore stuff. The
> > auto-fallback is so that stuff like users manually
> > disabling/re-enabling an output actually helps with fixing stuff.
> 
> Ok, that's fine.
> 
> If link goes bad while Weston does not implement link-status, I'm
> totally happy to degree any fallout from that to be a Weston bug. It
> has never worked (right?), so it cannot be a kernel regression. It is
> quite important to me to be able to say to that a failure is a Weston
> bug, not a kernel regression, to not piss on the kernel devs.
> 
> > > > - I guess we could do stuff that only fires off when allow_modeset is set,
> > > >   but I haven't found some examples. I thought we've had some outside of
> > > >   self-refresh helpers already. The closest I've found is again
> > > >   link-status, which never allows userspace to set BAD, and silently
> > > >   upgrades to GOOD. So that userspace doing a blind safe/restore can't
> > > >   wreak stuff permanently.  
> > >
> > > Sounds like link-status was designed with a blind save/restore in mind.  
> > 
> > Yeah that part we didn't screw up.
> > 
> > > > It's all a bit nasty :-/
> > > >
> > > > I think we should at least allow userspace to do a blind restore with
> > > > allow_modeset and not expect bad side-effects. That would mean fixing at
> > > > least the content protection stuff.
> > > >
> > > > Plus documenting this in the kernel somewhere. As the official thing to
> > > > do. But maybe we want some actual userspace doing this before we enshrine
> > > > it as official policy. The content protection fix is a one-liner and can
> > > > be cc'ed stable.  
> > >
> > > I'd probably not go there, a blind save does not guarantee a good
> > > state. The fix to "Content Protection" is not necessary (as long as
> > > userspace does not do a blind save/restore) if we can get the default
> > > state from the kernel. If we get the default state from the kernel,
> > > then userspace would be doing a blind restore but not save, meaning
> > > that the state actually is sane and writable.
> > >
> > > I'd love to volunteer for writing the Weston code to make use of "get me
> > > sane default state" UAPI, but I'm afraid I'm not in that much control
> > > of my time.  
> > 
> > The problem is, what is your default state? Driver defaults (generally
> > fairly random and mostly everything is off)? After fbcon has done
> > it's, which might never happen when you've disabling fbdev/fbcon?
> > Boot-up state from the firmware for drivers like i915 that support
> > fastboot (and what if that's garbage)? These can all be different too.
> 
> I believe what I want is more or less the driver defaults, or DRM core
> defaults for common KMS properties. It does not matter if they are
> arbitrary, as long as they remain the same across kernel versions. They
> need to be constants, so that I can rely on always getting to the same
> state on the same piece of hardware when I use the "default state"
> regardless of what might be happening software wise.
> 
> But of course the defaults must be legal values and they should make
> some sense. I'd consider "neutral", "identity", and "off" to be
> sensible values, but what that means for each property depends on the
> exact property. So there may be work to be done to unify the default
> values across drivers for driver-specific properties.
> 
> Having stuff off by default is ok. If that makes things not work, then
> we'll fix userspace to understand the new properties that are
> necessary. At no point it can be said to be a kernel regression - at
> least I very much hope and intend so.
> 
> There is always the problem of someone using new userspace first on old
> kernel that does not expose "default state" so userspace doesn't use
> "default state". Then upgrading to new kernel that has "default state"
> that is not good for making things work, but userspace starts using it
> because it becomes available and fails. Is there any way we can stop
> that from being seen as a kernel regression?

If you upgrade the kernel and existing userspace breaks, it's a
regression.

There's no wiggle room in this, the only wiggle room is that sometimes
people don't report issues, and eventually all affected hw is dead.

> Fbcon has exactly the problem that fbcon might not exist, so we cannot
> rely on that. Is fbcon even reliable enough to offer constant defaults
> that could be relied upon to stay the same across kernel versions and
> software configurations?

It's getting hacked on quite a bit, in a fairly ad-hoc fashion. So
"defaults as fbcon sets them" is something that's under fair bit of
change.

> State from firmware I would not trust at all, who knows when a firmware
> upgrade changes things.
> 
> What I'm saying is, the "default state" would very much be UABI. Not
> just how to access it, but the exact values of it for all existing
> properties. The important invariant is: for the same piece of hardware,
> I always get the same defaults (if any), regardless of any software or
> firmware versions.

Yup, that's why default state exposed by the kernel is such a pain: It's
uapi. If we change it, we can break something somewhere. And the problem
is complex enough that there's going to be lots of opinions about what
the default should be.

fbcon is a lot easier here, since there there's at least some agreement
that a working console should be visible and preferrably right side up and
all that.

> This invariant should guarantee that e.g. measured monitor color
> profiles stay valid across kernel upgrades. I know, people who actually
> care about color do not trust even reboots, but if we can let them get
> away with just verifying the profile instead of wasting hours in
> re-measuring the profile, that would be good.
> 
> When a driver starts supporting a new property, it must set the default
> value of the property to the value it has been using implicitly before.
> But if that implicit value was not hardcoded and depends on variable
> factors, then we probably cannot avoid a change in behaviour when
> introducing the property or "default values".

The problem is in figuring out that implicit value. In some cases I think
it's even "whatever the hw booted with", but I've tried to handle these
kinds of issues with atomic at least.

> > The hard part isn't the uapi, it's the implementation and defining
> > what you mean. Generally everyone wants their own version of default
> > value semantics, so this a) encodes policy, which we try to avoid in
> > the kernel and b) will be epic amounts of endless bikeshedding and
> > fighting between use-cases.
> 
> It's a policy, yes. But we need a single source of arbitrary default
> values, so that we have a strategy for achieving the same hardware
> state even when userspace does not understand all KMS properties:
> 
> - after losing and re-gaining DRM master status (e.g. temporary
>   VT-switch to another KMS program that modifies state I don't
>   understand), and
> 
> - across changes in the software stack.
> 
> The former is the most important point, the latter would be good to
> have.

But you can fix the above use-case also with save&restore. Which we're
trying to support (but even that has bugs).

> We already have a policy in the kernel: fbcon KMS state. If everyone
> can agree to that state, it's fine by me, but it also needs to be
> available when fbcon support is disabled. Maybe this could be a path
> forward? Expose what state fbcon would use, regardless of whether fbcon
> exists. It would probably miss the latter point, though, but that could
> be acceptable.
> 
> If we could trust that the KMS state is "sane" when any KMS program
> starts, then this problem doesn't exist: just read out the KMS state on
> start-up and use that. But since we don't know where the existing KMS
> state comes from, it could be anything. And for smooth hand-off between
> display servers, we can't have per DRM file description KMS state
> either.

You can assume that boot-up state is about as "sane" as it gets. So a
system compositor (in logind or whatever, it already deals with this
stuff) can reset the screen.

For everything else I'm not sure kernel defaults actually fixes anything.
It just gives everyont the illusion that now the problem is fixed, except
the fix is now kernel uapi, which means we can never change it.

The upshot of doing the exact same as a logind request along the lines of
"pls reset display to sane state, thx" is that logind can be changed much
easier than kernel uapi in case we get this wrong. Which we will.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-28 14:51                     ` Daniel Vetter
@ 2020-04-29 10:07                       ` Pekka Paalanen
  2020-04-30 13:53                         ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-04-29 10:07 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 15244 bytes --]

On Tue, 28 Apr 2020 16:51:57 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:
> > On Thu, 23 Apr 2020 17:01:49 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:
> >   
> > > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > > >
> > > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >    
> > 
> > ...

Hi,

please skip to the TL;DR at the bottom, the rest is just how I ended up
there.

> >   
> > > > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > > > more closely at stuff. Ignoring content protection I've found following
> > > > > approaches things:
> > > > >
> > > > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > > > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > > > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > > > >   for these kind of behind-the-scenes helpers like self-refresh.
> > > > >
> > > > > - link-status is always reset to "good" when you include any connector,
> > > > >   which might force a modeset. Even when allow_modeset isn't set by
> > > > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > > > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > > > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > > > >   would also work.    
> > > >
> > > > Wait, what do you mean "include any connector"?
> > > >
> > > > What exactly could cause a modeset instead of failure when
> > > > ALLOW_MODESET is not set?    
> > > 
> > > If you do an atomic commit with the connector included that has the
> > > bad link status, then it'll reset it to Good to try to get a full
> > > modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> > > it'll fail and userspace might be sad and not understand what's going
> > > on. We can easily fix this by only forcing the link training to be
> > > fixed if userspace has set ALLOW_MODESET.  
> > 
> > Hi,
> > 
> > oh, that's ok, the fail case is there like it should.
> > 
> > It sounded like even if userspace does not set ALLOW_MODESET, the
> > kernel would still do a modeset in come cases. I'm happy to have
> > misunderstood.  
> 
> Well currently that can go wrong, if you include a connector and a
> link-status is bad. But could be fixed fairly easily.

What do you mean by "include a connector"? Which property on which
object?

Weston always submits more or less full KMS state (known properties on
intended-active outputs) on all updates, on simple pageflips too, so I
am worried that the connector is "included", leading to automatic
papering over link-status failures, and Weston doesn't handle
link-status yet.

Weston does this, because it is the easy thing to do and debug. You
don't have to track back thousands of frames to figure out what the
mode or the connectors are, when something doesn't work like it should.
Or to figure out why some state wasn't emitted when it was supposed to.


> > > > I'd probably not go there, a blind save does not guarantee a good
> > > > state. The fix to "Content Protection" is not necessary (as long as
> > > > userspace does not do a blind save/restore) if we can get the default
> > > > state from the kernel. If we get the default state from the kernel,
> > > > then userspace would be doing a blind restore but not save, meaning
> > > > that the state actually is sane and writable.
> > > >
> > > > I'd love to volunteer for writing the Weston code to make use of "get me
> > > > sane default state" UAPI, but I'm afraid I'm not in that much control
> > > > of my time.    
> > > 
> > > The problem is, what is your default state? Driver defaults (generally
> > > fairly random and mostly everything is off)? After fbcon has done
> > > it's, which might never happen when you've disabling fbdev/fbcon?
> > > Boot-up state from the firmware for drivers like i915 that support
> > > fastboot (and what if that's garbage)? These can all be different too.  
> > 
> > I believe what I want is more or less the driver defaults, or DRM core
> > defaults for common KMS properties. It does not matter if they are
> > arbitrary, as long as they remain the same across kernel versions. They
> > need to be constants, so that I can rely on always getting to the same
> > state on the same piece of hardware when I use the "default state"
> > regardless of what might be happening software wise.
> > 
> > But of course the defaults must be legal values and they should make
> > some sense. I'd consider "neutral", "identity", and "off" to be
> > sensible values, but what that means for each property depends on the
> > exact property. So there may be work to be done to unify the default
> > values across drivers for driver-specific properties.
> > 
> > Having stuff off by default is ok. If that makes things not work, then
> > we'll fix userspace to understand the new properties that are
> > necessary. At no point it can be said to be a kernel regression - at
> > least I very much hope and intend so.
> > 
> > There is always the problem of someone using new userspace first on old
> > kernel that does not expose "default state" so userspace doesn't use
> > "default state". Then upgrading to new kernel that has "default state"
> > that is not good for making things work, but userspace starts using it
> > because it becomes available and fails. Is there any way we can stop
> > that from being seen as a kernel regression?  
> 
> If you upgrade the kernel and existing userspace breaks, it's a
> regression.
> 
> There's no wiggle room in this, the only wiggle room is that sometimes
> people don't report issues, and eventually all affected hw is dead.

Then the only possibility is that the transition from not using default
state to using default state does not change anything. Which means that
the default state must be the state the KMS program would normally
inherit, so, fbcon state?

> > Fbcon has exactly the problem that fbcon might not exist, so we cannot
> > rely on that. Is fbcon even reliable enough to offer constant defaults
> > that could be relied upon to stay the same across kernel versions and
> > software configurations?  
> 
> It's getting hacked on quite a bit, in a fairly ad-hoc fashion. So
> "defaults as fbcon sets them" is something that's under fair bit of
> change.

Yet, it seems to be the only "sane state" defined so far. Maybe we have
to live with it changing from boot to boot.

We also cannot ever have anything better, because of the above issue
"new userspace, old kernel, then upgrade kernel", can we? It doesn't
"break" userspace but it could be a regression for the end user,
because upgrading the kernel changes something that is visible and no
longer the same as before, e.g. breaking monitor calibration or profile
because suddenly some color related property is set differently.

When new KMS properties are introduced, how do people pick their
default values in the kernel when no userspace sets them yet? Are those
defaults immutable for eternity afterwards?

> > State from firmware I would not trust at all, who knows when a firmware
> > upgrade changes things.
> > 
> > What I'm saying is, the "default state" would very much be UABI. Not
> > just how to access it, but the exact values of it for all existing
> > properties. The important invariant is: for the same piece of hardware,
> > I always get the same defaults (if any), regardless of any software or
> > firmware versions.  
> 
> Yup, that's why default state exposed by the kernel is such a pain: It's
> uapi. If we change it, we can break something somewhere. And the problem
> is complex enough that there's going to be lots of opinions about what
> the default should be.

I think the current situation is just *hoping* that kernel doesn't
change its defaults when they are not even defined in any one place.
The defaults userspace currently gets in most cases is whatever it
inherits from fbcon. And you said fbcon changes ad-hoc.

> fbcon is a lot easier here, since there there's at least some agreement
> that a working console should be visible and preferrably right side up and
> all that.

Yeah, but what about gamma, color and other details that the console
usually doesn't care about as long as e.g. red is somewhat reddish?

I remember someone telling me, that fbcon does not reset gamma tables,
which means any gamma using KMS app can mess up the console and all
other KMS apps that don't use gamma.

> > This invariant should guarantee that e.g. measured monitor color
> > profiles stay valid across kernel upgrades. I know, people who actually
> > care about color do not trust even reboots, but if we can let them get
> > away with just verifying the profile instead of wasting hours in
> > re-measuring the profile, that would be good.
> > 
> > When a driver starts supporting a new property, it must set the default
> > value of the property to the value it has been using implicitly before.
> > But if that implicit value was not hardcoded and depends on variable
> > factors, then we probably cannot avoid a change in behaviour when
> > introducing the property or "default values".  
> 
> The problem is in figuring out that implicit value. In some cases I think
> it's even "whatever the hw booted with", but I've tried to handle these
> kinds of issues with atomic at least.
> 
> > > The hard part isn't the uapi, it's the implementation and defining
> > > what you mean. Generally everyone wants their own version of default
> > > value semantics, so this a) encodes policy, which we try to avoid in
> > > the kernel and b) will be epic amounts of endless bikeshedding and
> > > fighting between use-cases.  
> > 
> > It's a policy, yes. But we need a single source of arbitrary default
> > values, so that we have a strategy for achieving the same hardware
> > state even when userspace does not understand all KMS properties:
> > 
> > - after losing and re-gaining DRM master status (e.g. temporary
> >   VT-switch to another KMS program that modifies state I don't
> >   understand), and
> > 
> > - across changes in the software stack.
> > 
> > The former is the most important point, the latter would be good to
> > have.  
> 
> But you can fix the above use-case also with save&restore. Which we're
> trying to support (but even that has bugs).

Ok, so we're back to blind save & restore with undefined default state,
that depends not only on the hardware, the firmware, the kernel and
drivers, but also what was the previous KMS client doing before you
took over.

The only way to avoid undefined state is for all KMS programs to learn
to know all existing properties, and maybe warn the user if any
unrecognised properties appear. If KMS programs refuse to work when
unrecognised properties appear, that means you will never be able to
add anything to the properties again. Then you probably add caps for
new properties and we are back to square one except the UAPI is a lot
more complicated and userspace can longer warn the end user about
potentially bad KMS state.

Since the kernel refuses to provide a constant default state, maybe we
need a userspace library for that - one that *must* be updated for
every new property added to the kernel. Then we could circumvent the
kernel stable UAPI rules to provide predictable behaviour for userspace
that wants it.

Except, that too, would hit the "new userspace, old kernel, then
upgrade kernel" scenario where a kernel upgrade breaks userspace
because it allows userspace to start using new properties with legal
and working but different values from what it had before.

Are we really in a dead end with this?

> > We already have a policy in the kernel: fbcon KMS state. If everyone
> > can agree to that state, it's fine by me, but it also needs to be
> > available when fbcon support is disabled. Maybe this could be a path
> > forward? Expose what state fbcon would use, regardless of whether fbcon
> > exists. It would probably miss the latter point, though, but that could
> > be acceptable.
> > 
> > If we could trust that the KMS state is "sane" when any KMS program
> > starts, then this problem doesn't exist: just read out the KMS state on
> > start-up and use that. But since we don't know where the existing KMS
> > state comes from, it could be anything. And for smooth hand-off between
> > display servers, we can't have per DRM file description KMS state
> > either.  
> 
> You can assume that boot-up state is about as "sane" as it gets. So a
> system compositor (in logind or whatever, it already deals with this
> stuff) can reset the screen.

System compositors largely do not exist, the whole concept has been
dismissed. If logind would reset the screen, we would no longer have
seamless hand-off between any display servers and anything. Logind
could hand off the state for others to use as default state though.

How would one get the boot-up state?

Why do we need a userspace daemon to record KMS boot-up state instead
of the kernel giving it to userspace? How would that daemon itself get
the boot-up state if it has already been modified by fbcon?

> For everything else I'm not sure kernel defaults actually fixes anything.
> It just gives everyont the illusion that now the problem is fixed, except
> the fix is now kernel uapi, which means we can never change it.

If we want constant defaults, then by definition they cannot ever be
changed. It doesn't matter where it comes from.

I suppose the conclusion is that we cannot have constant defaults at
all. The only defaults we can have are undefined. If someone does not
like it, he needs to implement unrecognised properties in whatever
userspace he cares about.

> The upshot of doing the exact same as a logind request along the lines of
> "pls reset display to sane state, thx" is that logind can be changed much
> easier than kernel uapi in case we get this wrong. Which we will.

Sorry, I don't see the difference if the goal is to have fixed
defaults. The only difference with logind is that there is no angry
Linus.


TL;DR:

If blind save & restore (but restricted only to cases where KMS state
may have been lost!) works, it will ensure that a running userspace
program has what it was started with, but it does not guarantee that
the state is the same between restarts of the program.

It seems that the default state is always undefined, and cannot be
defined because whatever one defines might not always result in a
picture on screen. Bugs in defined default state cannot be fixed, by
definition as the default state is immutable once defined.

Ensuring consistent state between restarts is the responsibility of
userspace and each KMS program individually. They might attempt to
blindly save KMS state in a file to be re-used on the next start.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-29 10:07                       ` Pekka Paalanen
@ 2020-04-30 13:53                         ` Daniel Vetter
  2020-05-04  9:49                           ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-04-30 13:53 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

On Wed, Apr 29, 2020 at 01:07:54PM +0300, Pekka Paalanen wrote:
> On Tue, 28 Apr 2020 16:51:57 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> 
> > On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:
> > > On Thu, 23 Apr 2020 17:01:49 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > >   
> > > > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > > > >
> > > > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >    
> > > 
> > > ...
> 
> Hi,
> 
> please skip to the TL;DR at the bottom, the rest is just how I ended up
> there.
> 
> > >   
> > > > > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > > > > more closely at stuff. Ignoring content protection I've found following
> > > > > > approaches things:
> > > > > >
> > > > > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > > > > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > > > > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > > > > >   for these kind of behind-the-scenes helpers like self-refresh.
> > > > > >
> > > > > > - link-status is always reset to "good" when you include any connector,
> > > > > >   which might force a modeset. Even when allow_modeset isn't set by
> > > > > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > > > > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > > > > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > > > > >   would also work.    
> > > > >
> > > > > Wait, what do you mean "include any connector"?
> > > > >
> > > > > What exactly could cause a modeset instead of failure when
> > > > > ALLOW_MODESET is not set?    
> > > > 
> > > > If you do an atomic commit with the connector included that has the
> > > > bad link status, then it'll reset it to Good to try to get a full
> > > > modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> > > > it'll fail and userspace might be sad and not understand what's going
> > > > on. We can easily fix this by only forcing the link training to be
> > > > fixed if userspace has set ALLOW_MODESET.  
> > > 
> > > Hi,
> > > 
> > > oh, that's ok, the fail case is there like it should.
> > > 
> > > It sounded like even if userspace does not set ALLOW_MODESET, the
> > > kernel would still do a modeset in come cases. I'm happy to have
> > > misunderstood.  
> > 
> > Well currently that can go wrong, if you include a connector and a
> > link-status is bad. But could be fixed fairly easily.
> 
> What do you mean by "include a connector"? Which property on which
> object?
> 
> Weston always submits more or less full KMS state (known properties on
> intended-active outputs) on all updates, on simple pageflips too, so I
> am worried that the connector is "included", leading to automatic
> papering over link-status failures, and Weston doesn't handle
> link-status yet.

Include a connector = you have a property for a connector included in your
atomic update. Sounds like you do that, so if you have a real world
link-status failure, then things go a bit boom already.

I guess we need a kernel patch to make sure link-status only gets fixed
when userspace is ok with a modeset.

> Weston does this, because it is the easy thing to do and debug. You
> don't have to track back thousands of frames to figure out what the
> mode or the connectors are, when something doesn't work like it should.
> Or to figure out why some state wasn't emitted when it was supposed to.
> 
> 
> > > > > I'd probably not go there, a blind save does not guarantee a good
> > > > > state. The fix to "Content Protection" is not necessary (as long as
> > > > > userspace does not do a blind save/restore) if we can get the default
> > > > > state from the kernel. If we get the default state from the kernel,
> > > > > then userspace would be doing a blind restore but not save, meaning
> > > > > that the state actually is sane and writable.
> > > > >
> > > > > I'd love to volunteer for writing the Weston code to make use of "get me
> > > > > sane default state" UAPI, but I'm afraid I'm not in that much control
> > > > > of my time.    
> > > > 
> > > > The problem is, what is your default state? Driver defaults (generally
> > > > fairly random and mostly everything is off)? After fbcon has done
> > > > it's, which might never happen when you've disabling fbdev/fbcon?
> > > > Boot-up state from the firmware for drivers like i915 that support
> > > > fastboot (and what if that's garbage)? These can all be different too.  
> > > 
> > > I believe what I want is more or less the driver defaults, or DRM core
> > > defaults for common KMS properties. It does not matter if they are
> > > arbitrary, as long as they remain the same across kernel versions. They
> > > need to be constants, so that I can rely on always getting to the same
> > > state on the same piece of hardware when I use the "default state"
> > > regardless of what might be happening software wise.
> > > 
> > > But of course the defaults must be legal values and they should make
> > > some sense. I'd consider "neutral", "identity", and "off" to be
> > > sensible values, but what that means for each property depends on the
> > > exact property. So there may be work to be done to unify the default
> > > values across drivers for driver-specific properties.
> > > 
> > > Having stuff off by default is ok. If that makes things not work, then
> > > we'll fix userspace to understand the new properties that are
> > > necessary. At no point it can be said to be a kernel regression - at
> > > least I very much hope and intend so.
> > > 
> > > There is always the problem of someone using new userspace first on old
> > > kernel that does not expose "default state" so userspace doesn't use
> > > "default state". Then upgrading to new kernel that has "default state"
> > > that is not good for making things work, but userspace starts using it
> > > because it becomes available and fails. Is there any way we can stop
> > > that from being seen as a kernel regression?  
> > 
> > If you upgrade the kernel and existing userspace breaks, it's a
> > regression.
> > 
> > There's no wiggle room in this, the only wiggle room is that sometimes
> > people don't report issues, and eventually all affected hw is dead.
> 
> Then the only possibility is that the transition from not using default
> state to using default state does not change anything. Which means that
> the default state must be the state the KMS program would normally
> inherit, so, fbcon state?
> 
> > > Fbcon has exactly the problem that fbcon might not exist, so we cannot
> > > rely on that. Is fbcon even reliable enough to offer constant defaults
> > > that could be relied upon to stay the same across kernel versions and
> > > software configurations?  
> > 
> > It's getting hacked on quite a bit, in a fairly ad-hoc fashion. So
> > "defaults as fbcon sets them" is something that's under fair bit of
> > change.
> 
> Yet, it seems to be the only "sane state" defined so far. Maybe we have
> to live with it changing from boot to boot.
> 
> We also cannot ever have anything better, because of the above issue
> "new userspace, old kernel, then upgrade kernel", can we? It doesn't
> "break" userspace but it could be a regression for the end user,
> because upgrading the kernel changes something that is visible and no
> longer the same as before, e.g. breaking monitor calibration or profile
> because suddenly some color related property is set differently.
> 
> When new KMS properties are introduced, how do people pick their
> default values in the kernel when no userspace sets them yet? Are those
> defaults immutable for eternity afterwards?
> 
> > > State from firmware I would not trust at all, who knows when a firmware
> > > upgrade changes things.
> > > 
> > > What I'm saying is, the "default state" would very much be UABI. Not
> > > just how to access it, but the exact values of it for all existing
> > > properties. The important invariant is: for the same piece of hardware,
> > > I always get the same defaults (if any), regardless of any software or
> > > firmware versions.  
> > 
> > Yup, that's why default state exposed by the kernel is such a pain: It's
> > uapi. If we change it, we can break something somewhere. And the problem
> > is complex enough that there's going to be lots of opinions about what
> > the default should be.
> 
> I think the current situation is just *hoping* that kernel doesn't
> change its defaults when they are not even defined in any one place.
> The defaults userspace currently gets in most cases is whatever it
> inherits from fbcon. And you said fbcon changes ad-hoc.
> 
> > fbcon is a lot easier here, since there there's at least some agreement
> > that a working console should be visible and preferrably right side up and
> > all that.
> 
> Yeah, but what about gamma, color and other details that the console
> usually doesn't care about as long as e.g. red is somewhat reddish?
> 
> I remember someone telling me, that fbcon does not reset gamma tables,
> which means any gamma using KMS app can mess up the console and all
> other KMS apps that don't use gamma.
> 
> > > This invariant should guarantee that e.g. measured monitor color
> > > profiles stay valid across kernel upgrades. I know, people who actually
> > > care about color do not trust even reboots, but if we can let them get
> > > away with just verifying the profile instead of wasting hours in
> > > re-measuring the profile, that would be good.
> > > 
> > > When a driver starts supporting a new property, it must set the default
> > > value of the property to the value it has been using implicitly before.
> > > But if that implicit value was not hardcoded and depends on variable
> > > factors, then we probably cannot avoid a change in behaviour when
> > > introducing the property or "default values".  
> > 
> > The problem is in figuring out that implicit value. In some cases I think
> > it's even "whatever the hw booted with", but I've tried to handle these
> > kinds of issues with atomic at least.
> > 
> > > > The hard part isn't the uapi, it's the implementation and defining
> > > > what you mean. Generally everyone wants their own version of default
> > > > value semantics, so this a) encodes policy, which we try to avoid in
> > > > the kernel and b) will be epic amounts of endless bikeshedding and
> > > > fighting between use-cases.  
> > > 
> > > It's a policy, yes. But we need a single source of arbitrary default
> > > values, so that we have a strategy for achieving the same hardware
> > > state even when userspace does not understand all KMS properties:
> > > 
> > > - after losing and re-gaining DRM master status (e.g. temporary
> > >   VT-switch to another KMS program that modifies state I don't
> > >   understand), and
> > > 
> > > - across changes in the software stack.
> > > 
> > > The former is the most important point, the latter would be good to
> > > have.  
> > 
> > But you can fix the above use-case also with save&restore. Which we're
> > trying to support (but even that has bugs).
> 
> Ok, so we're back to blind save & restore with undefined default state,
> that depends not only on the hardware, the firmware, the kernel and
> drivers, but also what was the previous KMS client doing before you
> took over.
> 
> The only way to avoid undefined state is for all KMS programs to learn
> to know all existing properties, and maybe warn the user if any
> unrecognised properties appear. If KMS programs refuse to work when
> unrecognised properties appear, that means you will never be able to
> add anything to the properties again. Then you probably add caps for
> new properties and we are back to square one except the UAPI is a lot
> more complicated and userspace can longer warn the end user about
> potentially bad KMS state.
> 
> Since the kernel refuses to provide a constant default state, maybe we
> need a userspace library for that - one that *must* be updated for
> every new property added to the kernel. Then we could circumvent the
> kernel stable UAPI rules to provide predictable behaviour for userspace
> that wants it.
> 
> Except, that too, would hit the "new userspace, old kernel, then
> upgrade kernel" scenario where a kernel upgrade breaks userspace
> because it allows userspace to start using new properties with legal
> and working but different values from what it had before.
> 
> Are we really in a dead end with this?

One idea I proposed on irc is that logind would capture the boot-up state,
and you'd do a loginctl reset-screen cmd to reset it if something broke
somewhere. logind already has the code for forced vt switching, that feels
like a reasonable extension of that.

Then you could pick between "smooth transitions" (probably best for users
only using 1 compositor) and "whack it back to boot-up state when
switching to a new compositor" (maybe best default if you run multiple
different compositors, logind could even do that automatically).

> > > We already have a policy in the kernel: fbcon KMS state. If everyone
> > > can agree to that state, it's fine by me, but it also needs to be
> > > available when fbcon support is disabled. Maybe this could be a path
> > > forward? Expose what state fbcon would use, regardless of whether fbcon
> > > exists. It would probably miss the latter point, though, but that could
> > > be acceptable.
> > > 
> > > If we could trust that the KMS state is "sane" when any KMS program
> > > starts, then this problem doesn't exist: just read out the KMS state on
> > > start-up and use that. But since we don't know where the existing KMS
> > > state comes from, it could be anything. And for smooth hand-off between
> > > display servers, we can't have per DRM file description KMS state
> > > either.  
> > 
> > You can assume that boot-up state is about as "sane" as it gets. So a
> > system compositor (in logind or whatever, it already deals with this
> > stuff) can reset the screen.
> 
> System compositors largely do not exist, the whole concept has been
> dismissed. If logind would reset the screen, we would no longer have
> seamless hand-off between any display servers and anything. Logind
> could hand off the state for others to use as default state though.
> 
> How would one get the boot-up state?
> 
> Why do we need a userspace daemon to record KMS boot-up state instead
> of the kernel giving it to userspace? How would that daemon itself get
> the boot-up state if it has already been modified by fbcon?
> 
> > For everything else I'm not sure kernel defaults actually fixes anything.
> > It just gives everyont the illusion that now the problem is fixed, except
> > the fix is now kernel uapi, which means we can never change it.
> 
> If we want constant defaults, then by definition they cannot ever be
> changed. It doesn't matter where it comes from.
> 
> I suppose the conclusion is that we cannot have constant defaults at
> all. The only defaults we can have are undefined. If someone does not
> like it, he needs to implement unrecognised properties in whatever
> userspace he cares about.

Or config file of some sort to overwrite it. Like the kernel provides for
configuring fbcon already (as cmdline options). We add a new one every
time someone screams loud enough.

> > The upshot of doing the exact same as a logind request along the lines of
> > "pls reset display to sane state, thx" is that logind can be changed much
> > easier than kernel uapi in case we get this wrong. Which we will.
> 
> Sorry, I don't see the difference if the goal is to have fixed
> defaults. The only difference with logind is that there is no angry
> Linus.

Oh it's the exact same problem, but more focused. E.g. we don't have to
worry about embedded stuff, because that's going to run one compositor
only. So entire class of regression reports goes away.

If we put the defaults into the kernel, then more people will use it (it's
there), so the wiggle room is substantially reduced once we do have
conflict demands.

logind with multiple different compositors is going to be used on
desktops/latops only, so some chances that the definition of what default
should be all match up.

Also no regression pretty much applies across the board, e.g. we don't
have modifiers enabled in many compositors by default yet because it
breaks stuff.

> TL;DR:
> 
> If blind save & restore (but restricted only to cases where KMS state
> may have been lost!) works, it will ensure that a running userspace
> program has what it was started with, but it does not guarantee that
> the state is the same between restarts of the program.
> 
> It seems that the default state is always undefined, and cannot be
> defined because whatever one defines might not always result in a
> picture on screen. Bugs in defined default state cannot be fixed, by
> definition as the default state is immutable once defined.
> 
> Ensuring consistent state between restarts is the responsibility of
> userspace and each KMS program individually. They might attempt to
> blindly save KMS state in a file to be re-used on the next start.

I disagree with "each KMS program individually". We already have defined
protocols for drm master switching: Either the cooperative vt-switch
dance, or the unpriviledge switch logind allows. We also have ad-hoc
definitions of smooth transitions, it's "one primary plane, untiled". Some
people broke that, and now we have getfb2 so you can smoothly transition
even with tiled buffers. Except not everyone supports that.

So there's already a lot of stuff userspace needs to agree on, or your
compositor switching is more or less broken. "reset to defaults if a
compositor left a mess behind" is just one more aspect. Expecting the
kernel to provide a magic solution in form of defaults isn't going to work
well, same way it doesn't really work for drm master switching, or smooth
transitions. Or anything else that comes up in the future between
compositors.

I think on irc I mentioned that step 1 to fix this is come up with a
userspace protocol between compositors (with or without logind, but really
I'm not aware of anything else than logind). Once we have that, we can
figure out where to store the defaults. Maybe the right answer then is
"store them in the kernel, hardcoded as uapi", maybe not. But trying to
find the answer to that before we have a clear protocol and
responsibilities is just not going to work.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-04-30 13:53                         ` Daniel Vetter
@ 2020-05-04  9:49                           ` Pekka Paalanen
  2020-05-04 11:00                             ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-05-04  9:49 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 10207 bytes --]

On Thu, 30 Apr 2020 15:53:23 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Wed, Apr 29, 2020 at 01:07:54PM +0300, Pekka Paalanen wrote:
> > On Tue, 28 Apr 2020 16:51:57 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:
> >   
> > > On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:  
> > > > On Thu, 23 Apr 2020 17:01:49 +0200
> > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >     
> > > > > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:    
> > > > > >
> > > > > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > >      
> > > > 
> > > > ...  
> > 
> > Hi,
> > 
> > please skip to the TL;DR at the bottom, the rest is just how I ended up
> > there.
> >   
> > > >     
> > > > > > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > > > > > more closely at stuff. Ignoring content protection I've found following
> > > > > > > approaches things:
> > > > > > >
> > > > > > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > > > > > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > > > > > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > > > > > >   for these kind of behind-the-scenes helpers like self-refresh.
> > > > > > >
> > > > > > > - link-status is always reset to "good" when you include any connector,
> > > > > > >   which might force a modeset. Even when allow_modeset isn't set by
> > > > > > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > > > > > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > > > > > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > > > > > >   would also work.      
> > > > > >
> > > > > > Wait, what do you mean "include any connector"?
> > > > > >
> > > > > > What exactly could cause a modeset instead of failure when
> > > > > > ALLOW_MODESET is not set?      
> > > > > 
> > > > > If you do an atomic commit with the connector included that has the
> > > > > bad link status, then it'll reset it to Good to try to get a full
> > > > > modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> > > > > it'll fail and userspace might be sad and not understand what's going
> > > > > on. We can easily fix this by only forcing the link training to be
> > > > > fixed if userspace has set ALLOW_MODESET.    
> > > > 
> > > > Hi,
> > > > 
> > > > oh, that's ok, the fail case is there like it should.
> > > > 
> > > > It sounded like even if userspace does not set ALLOW_MODESET, the
> > > > kernel would still do a modeset in come cases. I'm happy to have
> > > > misunderstood.    
> > > 
> > > Well currently that can go wrong, if you include a connector and a
> > > link-status is bad. But could be fixed fairly easily.  
> > 
> > What do you mean by "include a connector"? Which property on which
> > object?
> > 
> > Weston always submits more or less full KMS state (known properties on
> > intended-active outputs) on all updates, on simple pageflips too, so I
> > am worried that the connector is "included", leading to automatic
> > papering over link-status failures, and Weston doesn't handle
> > link-status yet.  
> 
> Include a connector = you have a property for a connector included in your
> atomic update. Sounds like you do that, so if you have a real world
> link-status failure, then things go a bit boom already.

Are you talking about setting a connector's property, like "CRTC_ID",
"Content Protection", "HDCP Content Type"? Weston sets all of those on
every update if they exist. Or is it about any property on any
connector, not just a specific property on the specific connector, or
any property on the specific connector?

Also CRTC properties "MODE_ID" and "ACTIVE" are set on every update,
modeset or not. Weston just trusts that no-op changes in state do not
require a modeset.

Is it not the opposite? If a link fails, then Weston will produce a
glitch when the kernel automatically re-trains the link, and then
everything continues happily?

> I guess we need a kernel patch to make sure link-status only gets fixed
> when userspace is ok with a modeset.

That would be nice, but would it not also mean the above Weston case
ends up with a permanent black screen instead of a temporary glitch?

Or is there a hotplug uevent at play here too?

> > Weston does this, because it is the easy thing to do and debug. You
> > don't have to track back thousands of frames to figure out what the
> > mode or the connectors are, when something doesn't work like it should.
> > Or to figure out why some state wasn't emitted when it was supposed to.
> > 

...

> One idea I proposed on irc is that logind would capture the boot-up state,
> and you'd do a loginctl reset-screen cmd to reset it if something broke
> somewhere. logind already has the code for forced vt switching, that feels
> like a reasonable extension of that.

Needing to run a command manually to restore the screen instead of
"simply" switching to the graphical login manager to re-gain user
control seems a bit difficult.

> Then you could pick between "smooth transitions" (probably best for users
> only using 1 compositor) and "whack it back to boot-up state when
> switching to a new compositor" (maybe best default if you run multiple
> different compositors, logind could even do that automatically).

I think even better would be for logind to serialize and hand out the
default state it saved, instead of applying it. Then the compositor who
takes over can integrate the default state into its own state and
inherited state to apply it all in one commit.

Doing so, the compositor could even do a TEST_ONLY commit without
ALLOW_MODESET to see if a smooth transition is possible. If not, it can
skip some smooth transition animation for instance.

> > > The upshot of doing the exact same as a logind request along the lines of
> > > "pls reset display to sane state, thx" is that logind can be changed much
> > > easier than kernel uapi in case we get this wrong. Which we will.  
> > 
> > Sorry, I don't see the difference if the goal is to have fixed
> > defaults. The only difference with logind is that there is no angry
> > Linus.  
> 
> Oh it's the exact same problem, but more focused. E.g. we don't have to
> worry about embedded stuff, because that's going to run one compositor
> only. So entire class of regression reports goes away.
> 
> If we put the defaults into the kernel, then more people will use it (it's
> there), so the wiggle room is substantially reduced once we do have
> conflict demands.
> 
> logind with multiple different compositors is going to be used on
> desktops/latops only, so some chances that the definition of what default
> should be all match up.
> 
> Also no regression pretty much applies across the board, e.g. we don't
> have modifiers enabled in many compositors by default yet because it
> breaks stuff.

I'm not sure I understand, but ok, I'll stop pushing for it.


> > TL;DR:
> > 
> > If blind save & restore (but restricted only to cases where KMS state
> > may have been lost!) works, it will ensure that a running userspace
> > program has what it was started with, but it does not guarantee that
> > the state is the same between restarts of the program.
> > 
> > It seems that the default state is always undefined, and cannot be
> > defined because whatever one defines might not always result in a
> > picture on screen. Bugs in defined default state cannot be fixed, by
> > definition as the default state is immutable once defined.
> > 
> > Ensuring consistent state between restarts is the responsibility of
> > userspace and each KMS program individually. They might attempt to
> > blindly save KMS state in a file to be re-used on the next start.  
> 
> I disagree with "each KMS program individually". We already have defined
> protocols for drm master switching: Either the cooperative vt-switch
> dance, or the unpriviledge switch logind allows. We also have ad-hoc
> definitions of smooth transitions, it's "one primary plane, untiled". Some
> people broke that, and now we have getfb2 so you can smoothly transition
> even with tiled buffers. Except not everyone supports that.

Switching is not the point with saving unrecognized KMS state into a
compositor-specific file. The point of the file is to re-create the
exact same KMS state after the compositor has been restarted,
regardless of what KMS program might have ran first.

Use case: when you have completed monitor color profiling, the profile
may become invalid as soon as anything affecting colors changes, e.g.
link bit depth. With all the new properties for color spaces, HDR etc.
the likelihood of an unrecognized property affecting colors is quite
high. Either the property must be reset to the value it had during
profiling, or the profile needs to be rejected. This is more about
rebooting the machine than it is about switching between display
servers.

If the compositor has such a file saved, it might as well use it also
for sanitation when switching from another display server.

This still does not handle the case where a kernel upgrade introduces a
new property, the graphical login manager sets the property altering
colors, and the session compositor you profiled does not recognize the
new property and also does not have it saved in the file because it
didn't exist last time. But maybe that's a corner-case enough, and
people who care will notice the profile is invalid when physically
measuring its validity before starting to work. But this could be seen
as a kernel regression too: you only had to upgrade the kernel. Logind
saving state before fbcon takes over could remedy this, if the session
compositor integrated the logind saved state on every start.

But again we would rely on kernel not changing the hardware state
between kernel versions before fbcon takes over.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-05-04  9:49                           ` Pekka Paalanen
@ 2020-05-04 11:00                             ` Daniel Vetter
  2020-05-04 12:22                               ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-05-04 11:00 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson

On Mon, May 4, 2020 at 11:49 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Thu, 30 Apr 2020 15:53:23 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
>
> > On Wed, Apr 29, 2020 at 01:07:54PM +0300, Pekka Paalanen wrote:
> > > On Tue, 28 Apr 2020 16:51:57 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > >
> > > > On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:
> > > > > On Thu, 23 Apr 2020 17:01:49 +0200
> > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > >
> > > > > > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > >
> > > > >
> > > > > ...
> > >
> > > Hi,
> > >
> > > please skip to the TL;DR at the bottom, the rest is just how I ended up
> > > there.
> > >
> > > > >
> > > > > > > > Note that the kernel isn't entire consistent on this. I've looked a bit
> > > > > > > > more closely at stuff. Ignoring content protection I've found following
> > > > > > > > approaches things:
> > > > > > > >
> > > > > > > > - self refresh helpers, which are entirely transparent. Therefore we do a
> > > > > > > >   hack to set allow_modeset when the self-refresh helpers need to do a
> > > > > > > >   modeset, to avoid total surprise for userspace. I think this is only ok
> > > > > > > >   for these kind of behind-the-scenes helpers like self-refresh.
> > > > > > > >
> > > > > > > > - link-status is always reset to "good" when you include any connector,
> > > > > > > >   which might force a modeset. Even when allow_modeset isn't set by
> > > > > > > >   userspace. Maybe we should fix that, but we've discussed forever how to
> > > > > > > >   make sure a bad link isn't ever stuck at "bad" for old userspace, so
> > > > > > > >   we've gone with this. But maybe limiting to only allow_modeset cases
> > > > > > > >   would also work.
> > > > > > >
> > > > > > > Wait, what do you mean "include any connector"?
> > > > > > >
> > > > > > > What exactly could cause a modeset instead of failure when
> > > > > > > ALLOW_MODESET is not set?
> > > > > >
> > > > > > If you do an atomic commit with the connector included that has the
> > > > > > bad link status, then it'll reset it to Good to try to get a full
> > > > > > modeset to restore stuff. If you don't also have ALLOW_MODESET set,
> > > > > > it'll fail and userspace might be sad and not understand what's going
> > > > > > on. We can easily fix this by only forcing the link training to be
> > > > > > fixed if userspace has set ALLOW_MODESET.
> > > > >
> > > > > Hi,
> > > > >
> > > > > oh, that's ok, the fail case is there like it should.
> > > > >
> > > > > It sounded like even if userspace does not set ALLOW_MODESET, the
> > > > > kernel would still do a modeset in come cases. I'm happy to have
> > > > > misunderstood.
> > > >
> > > > Well currently that can go wrong, if you include a connector and a
> > > > link-status is bad. But could be fixed fairly easily.
> > >
> > > What do you mean by "include a connector"? Which property on which
> > > object?
> > >
> > > Weston always submits more or less full KMS state (known properties on
> > > intended-active outputs) on all updates, on simple pageflips too, so I
> > > am worried that the connector is "included", leading to automatic
> > > papering over link-status failures, and Weston doesn't handle
> > > link-status yet.
> >
> > Include a connector = you have a property for a connector included in your
> > atomic update. Sounds like you do that, so if you have a real world
> > link-status failure, then things go a bit boom already.
>
> Are you talking about setting a connector's property, like "CRTC_ID",
> "Content Protection", "HDCP Content Type"? Weston sets all of those on
> every update if they exist. Or is it about any property on any
> connector, not just a specific property on the specific connector, or
> any property on the specific connector?

Any property on the specific connector where link-status has gone "bad".

> Also CRTC properties "MODE_ID" and "ACTIVE" are set on every update,
> modeset or not. Weston just trusts that no-op changes in state do not
> require a modeset

Yup that's generally valid assumption, but also that's enough to hit
the link-status == "bad" forced modeset I think.

> Is it not the opposite? If a link fails, then Weston will produce a
> glitch when the kernel automatically re-trains the link, and then
> everything continues happily?

That's the soft link failure, where the kernel can recover on its own.
There's the harder one where the new link status is degraded, and the
old mode doesn't fit. Or the kernel just threw a fit and can't make
stuff work anymore, and waits for the next userspace commit. Iirc for
MST links we can't easily do this because locking, so gets pushed to a
full modeset commit. Or something like that.

> > I guess we need a kernel patch to make sure link-status only gets fixed
> > when userspace is ok with a modeset.
>
> That would be nice, but would it not also mean the above Weston case
> ends up with a permanent black screen instead of a temporary glitch?
>
> Or is there a hotplug uevent at play here too?

Yeah you get a hotplug uevent, so as long as you eventually process
that and do a full modeset should be ok.

> > > Weston does this, because it is the easy thing to do and debug. You
> > > don't have to track back thousands of frames to figure out what the
> > > mode or the connectors are, when something doesn't work like it should.
> > > Or to figure out why some state wasn't emitted when it was supposed to.
> > >
>
> ...
>
> > One idea I proposed on irc is that logind would capture the boot-up state,
> > and you'd do a loginctl reset-screen cmd to reset it if something broke
> > somewhere. logind already has the code for forced vt switching, that feels
> > like a reasonable extension of that.
>
> Needing to run a command manually to restore the screen instead of
> "simply" switching to the graphical login manager to re-gain user
> control seems a bit difficult.

But that requires mind reading skills. Computer can't tell the "pls
don't wreak stuff, I want smooth switching" from "pls wreak stuff,
because it's glitching" wish the user has when vt-switching.

> > Then you could pick between "smooth transitions" (probably best for users
> > only using 1 compositor) and "whack it back to boot-up state when
> > switching to a new compositor" (maybe best default if you run multiple
> > different compositors, logind could even do that automatically).
>
> I think even better would be for logind to serialize and hand out the
> default state it saved, instead of applying it. Then the compositor who
> takes over can integrate the default state into its own state and
> inherited state to apply it all in one commit.

Yeah that'd be a flavour of this userspace protocol that we need first
- I have no idea about compositor interactions in userspace and what
protocols they'd want, my input isn't going to be much use here I
think.

> Doing so, the compositor could even do a TEST_ONLY commit without
> ALLOW_MODESET to see if a smooth transition is possible. If not, it can
> skip some smooth transition animation for instance.

Generally that's not enough, most people want some blending. And for
blending, you need to be able to read&understand what's already there.
Hence getfb2 so you can get at modifiers.

> > > > The upshot of doing the exact same as a logind request along the lines of
> > > > "pls reset display to sane state, thx" is that logind can be changed much
> > > > easier than kernel uapi in case we get this wrong. Which we will.
> > >
> > > Sorry, I don't see the difference if the goal is to have fixed
> > > defaults. The only difference with logind is that there is no angry
> > > Linus.
> >
> > Oh it's the exact same problem, but more focused. E.g. we don't have to
> > worry about embedded stuff, because that's going to run one compositor
> > only. So entire class of regression reports goes away.
> >
> > If we put the defaults into the kernel, then more people will use it (it's
> > there), so the wiggle room is substantially reduced once we do have
> > conflict demands.
> >
> > logind with multiple different compositors is going to be used on
> > desktops/latops only, so some chances that the definition of what default
> > should be all match up.
> >
> > Also no regression pretty much applies across the board, e.g. we don't
> > have modifiers enabled in many compositors by default yet because it
> > breaks stuff.
>
> I'm not sure I understand, but ok, I'll stop pushing for it.

Maybe to clarify: I'm not against adding something to solve this to
the kernel, period. If logind (or whatever) implements a proper
compositor switching protocol for kms state (instead of the current
"whatever, maybe glitches" approach), and it turns out the kernels
needs something to make that work, we can add it. But adding default
state to the kernel just so compositors can continue to not coordinate
how vt switching and all that is supposed to work without glitches and
leftover planes/cursors or whatever, then no, I don't like that idea.
Because I don't think it'll really work. If that means vt switching is
only glitch-free and can be reset to a working state if a compositor
crashed if you have logind working, *shrug*.

> > > TL;DR:
> > >
> > > If blind save & restore (but restricted only to cases where KMS state
> > > may have been lost!) works, it will ensure that a running userspace
> > > program has what it was started with, but it does not guarantee that
> > > the state is the same between restarts of the program.
> > >
> > > It seems that the default state is always undefined, and cannot be
> > > defined because whatever one defines might not always result in a
> > > picture on screen. Bugs in defined default state cannot be fixed, by
> > > definition as the default state is immutable once defined.
> > >
> > > Ensuring consistent state between restarts is the responsibility of
> > > userspace and each KMS program individually. They might attempt to
> > > blindly save KMS state in a file to be re-used on the next start.
> >
> > I disagree with "each KMS program individually". We already have defined
> > protocols for drm master switching: Either the cooperative vt-switch
> > dance, or the unpriviledge switch logind allows. We also have ad-hoc
> > definitions of smooth transitions, it's "one primary plane, untiled". Some
> > people broke that, and now we have getfb2 so you can smoothly transition
> > even with tiled buffers. Except not everyone supports that.
>
> Switching is not the point with saving unrecognized KMS state into a
> compositor-specific file. The point of the file is to re-create the
> exact same KMS state after the compositor has been restarted,
> regardless of what KMS program might have ran first.

This example isn't about when you can capture the "default state",
saving on every vt switch doesn't work. This was another example to
illustrate that smooth, glitch-free vt switching without coordination
between compositors (and enforced by logind) doesn't work. Some
compositors leave a tiled buffer behind, if the new compositor doesn't
implement getfb2 then you get ugly artifacts when trying to blend to
the new screen.

> Use case: when you have completed monitor color profiling, the profile
> may become invalid as soon as anything affecting colors changes, e.g.
> link bit depth. With all the new properties for color spaces, HDR etc.
> the likelihood of an unrecognized property affecting colors is quite
> high. Either the property must be reset to the value it had during
> profiling, or the profile needs to be rejected. This is more about
> rebooting the machine than it is about switching between display
> servers.
>
> If the compositor has such a file saved, it might as well use it also
> for sanitation when switching from another display server.
>
> This still does not handle the case where a kernel upgrade introduces a
> new property, the graphical login manager sets the property altering
> colors, and the session compositor you profiled does not recognize the
> new property and also does not have it saved in the file because it
> didn't exist last time. But maybe that's a corner-case enough, and
> people who care will notice the profile is invalid when physically
> measuring its validity before starting to work. But this could be seen
> as a kernel regression too: you only had to upgrade the kernel. Logind
> saving state before fbcon takes over could remedy this, if the session
> compositor integrated the logind saved state on every start.
>
> But again we would rely on kernel not changing the hardware state
> between kernel versions before fbcon takes over.

I guess my point is, this is a lot bigger problem than just the
default state. This = vt switching that doesn't look horrible and
doesn't result in artifacts and issues on the new compositor.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-05-04 11:00                             ` Daniel Vetter
@ 2020-05-04 12:22                               ` Pekka Paalanen
  2020-05-05  8:48                                 ` Daniel Vetter
  0 siblings, 1 reply; 54+ messages in thread
From: Pekka Paalanen @ 2020-05-04 12:22 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 12755 bytes --]

On Mon, 4 May 2020 13:00:02 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> On Mon, May 4, 2020 at 11:49 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Thu, 30 Apr 2020 15:53:23 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:
> >  
> > > On Wed, Apr 29, 2020 at 01:07:54PM +0300, Pekka Paalanen wrote:  
> > > > On Tue, 28 Apr 2020 16:51:57 +0200
> > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >  
> > > > > On Fri, Apr 24, 2020 at 11:32:16AM +0300, Pekka Paalanen wrote:  
> > > > > > On Thu, 23 Apr 2020 17:01:49 +0200
> > > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > >  
> > > > > > > On Tue, Apr 21, 2020 at 4:33 PM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > > > > > > >
> > > > > > > > On Tue, 21 Apr 2020 14:15:52 +0200
> > > > > > > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > > > > >  
> > > > > >

> > > Include a connector = you have a property for a connector included in your
> > > atomic update. Sounds like you do that, so if you have a real world
> > > link-status failure, then things go a bit boom already.  
> >
> > Are you talking about setting a connector's property, like "CRTC_ID",
> > "Content Protection", "HDCP Content Type"? Weston sets all of those on
> > every update if they exist. Or is it about any property on any
> > connector, not just a specific property on the specific connector, or
> > any property on the specific connector?  
> 
> Any property on the specific connector where link-status has gone "bad".

Ok, now I got it clearly. Thanks.

> > Also CRTC properties "MODE_ID" and "ACTIVE" are set on every update,
> > modeset or not. Weston just trusts that no-op changes in state do not
> > require a modeset  
> 
> Yup that's generally valid assumption, but also that's enough to hit
> the link-status == "bad" forced modeset I think.
> 
> > Is it not the opposite? If a link fails, then Weston will produce a
> > glitch when the kernel automatically re-trains the link, and then
> > everything continues happily?  
> 
> That's the soft link failure, where the kernel can recover on its own.
> There's the harder one where the new link status is degraded, and the
> old mode doesn't fit. Or the kernel just threw a fit and can't make
> stuff work anymore, and waits for the next userspace commit. Iirc for
> MST links we can't easily do this because locking, so gets pushed to a
> full modeset commit. Or something like that.
> 
> > > I guess we need a kernel patch to make sure link-status only gets fixed
> > > when userspace is ok with a modeset.  
> >
> > That would be nice, but would it not also mean the above Weston case
> > ends up with a permanent black screen instead of a temporary glitch?
> >
> > Or is there a hotplug uevent at play here too?  
> 
> Yeah you get a hotplug uevent, so as long as you eventually process
> that and do a full modeset should be ok.

I suppose Weston falls short there. Getting a hotplug event does not
force a modeset on all active CRTCs. Weston does re-read resources and
properties, but it does not trigger anything like re-picking the video
mode if the connection status and EDID remains the same.

I hope we can say Weston never worked with link failures to begin with,
and fix things properly through the link status property in Weston.

> > > > Weston does this, because it is the easy thing to do and debug. You
> > > > don't have to track back thousands of frames to figure out what the
> > > > mode or the connectors are, when something doesn't work like it should.
> > > > Or to figure out why some state wasn't emitted when it was supposed to.
> > > >  
> >
> > ...
> >  
> > > One idea I proposed on irc is that logind would capture the boot-up state,
> > > and you'd do a loginctl reset-screen cmd to reset it if something broke
> > > somewhere. logind already has the code for forced vt switching, that feels
> > > like a reasonable extension of that.  
> >
> > Needing to run a command manually to restore the screen instead of
> > "simply" switching to the graphical login manager to re-gain user
> > control seems a bit difficult.  
> 
> But that requires mind reading skills. Computer can't tell the "pls
> don't wreak stuff, I want smooth switching" from "pls wreak stuff,
> because it's glitching" wish the user has when vt-switching.

No, it doesn't. The graphical login manager already saved its default
state and it will always reinstate its default state. The session
compositor hopefully inherits the same or at least compatible default
state from the login manager, so the session compositor has it too. So
smooth hand-over is always possible unless the session compositor goes
out of its way to set otherwise. If the session compositor screws
something up like a gamma table, the login manager's default state will
just overwrite the bad data anyway. Whether that becomes a smooth
transition or not depends on whether the reset needs a modeset and if
the old FB is understandable.

The session will be permanently broken, as it will re-break when
switching back to it, but at least the login manager remains always
accessible.

> > > Then you could pick between "smooth transitions" (probably best for users
> > > only using 1 compositor) and "whack it back to boot-up state when
> > > switching to a new compositor" (maybe best default if you run multiple
> > > different compositors, logind could even do that automatically).  
> >
> > I think even better would be for logind to serialize and hand out the
> > default state it saved, instead of applying it. Then the compositor who
> > takes over can integrate the default state into its own state and
> > inherited state to apply it all in one commit.  
> 
> Yeah that'd be a flavour of this userspace protocol that we need first
> - I have no idea about compositor interactions in userspace and what
> protocols they'd want, my input isn't going to be much use here I
> think.
> 
> > Doing so, the compositor could even do a TEST_ONLY commit without
> > ALLOW_MODESET to see if a smooth transition is possible. If not, it can
> > skip some smooth transition animation for instance.  
> 
> Generally that's not enough, most people want some blending. And for
> blending, you need to be able to read&understand what's already there.
> Hence getfb2 so you can get at modifiers.

I assumed a smooth transition does not make sense if a modeset is
required. TEST_ONLY is useful for checking that.

...

> Maybe to clarify: I'm not against adding something to solve this to
> the kernel, period. If logind (or whatever) implements a proper
> compositor switching protocol for kms state (instead of the current
> "whatever, maybe glitches" approach), and it turns out the kernels
> needs something to make that work, we can add it. But adding default
> state to the kernel just so compositors can continue to not coordinate
> how vt switching and all that is supposed to work without glitches and
> leftover planes/cursors or whatever, then no, I don't like that idea.
> Because I don't think it'll really work. If that means vt switching is
> only glitch-free and can be reset to a working state if a compositor
> crashed if you have logind working, *shrug*.

We don't actually need new kernel UAPI for doing the non-cooperative
ignorant design. We can just do a blind save & restore to get that. (By
"blind", I have always meant "including properties you do not
understand"; not that you would use the state set as a whole without
merging it with other inherited or wanted state which you do
understand.)

What I was after with the kernel-hardcoded defaults is keeping the same
state across restarts and reboots, even when the kernel gains new KMS
properties. After all, the kernel drivers are the only thing that could
know what the value in the hardware was before a new property to
control it is introduced. A side-effect of such defaults being
available is that they could be used instead of the blind save, to do a
blind restore of the properties a KMS program does not understand.

> > > > TL;DR:
> > > >
> > > > If blind save & restore (but restricted only to cases where KMS state
> > > > may have been lost!) works, it will ensure that a running userspace
> > > > program has what it was started with, but it does not guarantee that
> > > > the state is the same between restarts of the program.

This was about switching.

> > > >
> > > > It seems that the default state is always undefined, and cannot be
> > > > defined because whatever one defines might not always result in a
> > > > picture on screen. Bugs in defined default state cannot be fixed, by
> > > > definition as the default state is immutable once defined.
> > > >
> > > > Ensuring consistent state between restarts is the responsibility of

Here I started talking about "restarts".

> > > > userspace and each KMS program individually. They might attempt to
> > > > blindly save KMS state in a file to be re-used on the next start.  
> > >
> > > I disagree with "each KMS program individually". We already have defined
> > > protocols for drm master switching: Either the cooperative vt-switch
> > > dance, or the unpriviledge switch logind allows. We also have ad-hoc
> > > definitions of smooth transitions, it's "one primary plane, untiled". Some
> > > people broke that, and now we have getfb2 so you can smoothly transition
> > > even with tiled buffers. Except not everyone supports that.  
> >
> > Switching is not the point with saving unrecognized KMS state into a
> > compositor-specific file. The point of the file is to re-create the
> > exact same KMS state after the compositor has been restarted,
> > regardless of what KMS program might have ran first.  
> 
> This example isn't about when you can capture the "default state",
> saving on every vt switch doesn't work. This was another example to
> illustrate that smooth, glitch-free vt switching without coordination
> between compositors (and enforced by logind) doesn't work. Some
> compositors leave a tiled buffer behind, if the new compositor doesn't
> implement getfb2 then you get ugly artifacts when trying to blend to
> the new screen.

Right, about switching. The blind save & restore solves switching
for the the unrecognized KMS properties. I was not seeking to ensure
that switching results in smooth transitions, only that smooth
transitions are not deliberately made impossible by e.g. logind always
resetting KMS state on switch.

So I consider switching solved, the only open question is about
restarts.

> > Use case: when you have completed monitor color profiling, the profile
> > may become invalid as soon as anything affecting colors changes, e.g.
> > link bit depth. With all the new properties for color spaces, HDR etc.
> > the likelihood of an unrecognized property affecting colors is quite
> > high. Either the property must be reset to the value it had during
> > profiling, or the profile needs to be rejected. This is more about
> > rebooting the machine than it is about switching between display
> > servers.
> >
> > If the compositor has such a file saved, it might as well use it also
> > for sanitation when switching from another display server.
> >
> > This still does not handle the case where a kernel upgrade introduces a
> > new property, the graphical login manager sets the property altering
> > colors, and the session compositor you profiled does not recognize the
> > new property and also does not have it saved in the file because it
> > didn't exist last time. But maybe that's a corner-case enough, and
> > people who care will notice the profile is invalid when physically
> > measuring its validity before starting to work. But this could be seen
> > as a kernel regression too: you only had to upgrade the kernel. Logind
> > saving state before fbcon takes over could remedy this, if the session
> > compositor integrated the logind saved state on every start.
> >
> > But again we would rely on kernel not changing the hardware state
> > between kernel versions before fbcon takes over.  
> 
> I guess my point is, this is a lot bigger problem than just the
> default state. This = vt switching that doesn't look horrible and
> doesn't result in artifacts and issues on the new compositor.

I am interested in getting reliably to the same hardware state as you
used to have before, either after reboots or after vt-switches. The
transition does not have to be guaranteed to be smooth, but at the same
time the restore policy should not exclude the possibility of a
smooth transition.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-05-04 12:22                               ` Pekka Paalanen
@ 2020-05-05  8:48                                 ` Daniel Vetter
  2020-05-07  9:03                                   ` Pekka Paalanen
  0 siblings, 1 reply; 54+ messages in thread
From: Daniel Vetter @ 2020-05-05  8:48 UTC (permalink / raw)
  To: Pekka Paalanen
  Cc: Thomas Zimmermann, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Nitin Joshi1,
	Rajat Jain, Mark Pearson, Benjamin Berg

Refocusing on where I think we still have a bit a disconnnect.

On Mon, May 04, 2020 at 03:22:28PM +0300, Pekka Paalanen wrote:
> On Mon, 4 May 2020 13:00:02 +0200
> Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Mon, May 4, 2020 at 11:49 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > > On Thu, 30 Apr 2020 15:53:23 +0200
> > > Daniel Vetter <daniel@ffwll.ch> wrote:
> > I guess my point is, this is a lot bigger problem than just the
> > default state. This = vt switching that doesn't look horrible and
> > doesn't result in artifacts and issues on the new compositor.
> 
> I am interested in getting reliably to the same hardware state as you
> used to have before, either after reboots or after vt-switches. The
> transition does not have to be guaranteed to be smooth, but at the same
> time the restore policy should not exclude the possibility of a
> smooth transition.

So my point is kinda that if you want both, then the only way to get there
to have a very clear set of what's allowed to be left behind be the
outgoing compositor. For example:
- only primary plane
- only untiled
- no color transform magic
- ...

Imo this is a related problem to the save/restore topic, since if one
compositor does something the new one doesn't understand (e.g. tiled
buffers with modifiers, and new compositor doesn't use getfb2), then it's
going to break.

Similar, if the old compositor sets a new color transform property that
the new compositor doesn't understand, then you get a mess.

Blind restore handles the permanent mess issue, but it doesn't handle the
smooth transition issue. But both problems are of the "old compositor did
something the new compositor doesn't understand", hence why I chuck them
into the same bin. And the issue with a blind save/restore, or kernel
defaults, or any of the solutions proposed here is that they pretty much
guarantee non-smooth transitions, they only solve the permanent damange
part of the problem.

I think to solve both, we need some kind of proper compositor switching
protocol, e.g. using logind:
- old compositor transitions to the minimal config as specified somewhere
- logind forces all other properties to "defaults", whether that's
  restoring boot-up defaults or defaults obtained from the kernel or
  something else is tbd
- logind maybe even does a transition if there's multiple version of the
  protocol, e.g. v2 allows modifiers, v1 only untiled, so it'd need to do
  a blit to untiled if the new compositor only supports v1 switching
  protocol
- new compositor takes over, and can continue the smooth transition since
  it's a well-defined starting state with limited feature usage, _and_
  everything else is reset to "defaults"
 
I fear that if we only solve the "resets to defaults" issue then we can
draw ourselves into a corner for the smooth transition stuff, if e.g. the
wrong entity in the above dance forces the reset to defaults.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support)
  2020-05-05  8:48                                 ` Daniel Vetter
@ 2020-05-07  9:03                                   ` Pekka Paalanen
  0 siblings, 0 replies; 54+ messages in thread
From: Pekka Paalanen @ 2020-05-07  9:03 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Benjamin Berg, David Airlie, Christian Kellner, dri-devel,
	Javier Martinez Canillas, Hans de Goede, Thomas Zimmermann,
	Nitin Joshi1, Rajat Jain, Mark Pearson


[-- Attachment #1.1: Type: text/plain, Size: 5438 bytes --]

On Tue, 5 May 2020 10:48:52 +0200
Daniel Vetter <daniel@ffwll.ch> wrote:

> Refocusing on where I think we still have a bit a disconnnect.
> 
> On Mon, May 04, 2020 at 03:22:28PM +0300, Pekka Paalanen wrote:
> > On Mon, 4 May 2020 13:00:02 +0200
> > Daniel Vetter <daniel@ffwll.ch> wrote:  
> > > On Mon, May 4, 2020 at 11:49 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:  
> > > > On Thu, 30 Apr 2020 15:53:23 +0200
> > > > Daniel Vetter <daniel@ffwll.ch> wrote:  
> > > I guess my point is, this is a lot bigger problem than just the
> > > default state. This = vt switching that doesn't look horrible and
> > > doesn't result in artifacts and issues on the new compositor.  
> > 
> > I am interested in getting reliably to the same hardware state as you
> > used to have before, either after reboots or after vt-switches. The
> > transition does not have to be guaranteed to be smooth, but at the same
> > time the restore policy should not exclude the possibility of a
> > smooth transition.  
> 
> So my point is kinda that if you want both, then the only way to get there
> to have a very clear set of what's allowed to be left behind be the
> outgoing compositor. For example:
> - only primary plane
> - only untiled
> - no color transform magic
> - ...

Hi,

agreed, and to this I was implicitly adding "for everything else the
old compositor must leave behind the so called default state".

That means the new compositor restoring defaults to the properties it
does not understand is a no-op, unless the old compositor violated the
smooth transition rules, in which case it might not be a smooth
transition.

If on some system, the default state does not result in a "usable"
picture on screen, then a compositor that does not understand the
property will always be broken and has always been broken.

If such compositor ever worked ok on such system, it was because it
inherited a property value that made things work. So if using defaults
breaks things, it means the defaults were not actually the defaults.

This definition probably means that defaults cannot be universal
constants, but I think it would be enough if they were constants given
the hardware and firmware at hand. IOW, not changed by simply rebooting
or upgrading the kernel.

> 
> Imo this is a related problem to the save/restore topic, since if one
> compositor does something the new one doesn't understand (e.g. tiled
> buffers with modifiers, and new compositor doesn't use getfb2), then it's
> going to break.
> 
> Similar, if the old compositor sets a new color transform property that
> the new compositor doesn't understand, then you get a mess.
> 
> Blind restore handles the permanent mess issue, but it doesn't handle the
> smooth transition issue. But both problems are of the "old compositor did
> something the new compositor doesn't understand", hence why I chuck them
> into the same bin. And the issue with a blind save/restore, or kernel
> defaults, or any of the solutions proposed here is that they pretty much
> guarantee non-smooth transitions, they only solve the permanent damange
> part of the problem.

Right, except I disagree on the "guarantee non-smooth transition", as I
explained above.

> I think to solve both, we need some kind of proper compositor switching
> protocol, e.g. using logind:
> - old compositor transitions to the minimal config as specified somewhere
> - logind forces all other properties to "defaults", whether that's
>   restoring boot-up defaults or defaults obtained from the kernel or
>   something else is tbd

In my mind, the new compositor would do this step. If the old
compositor did its job, it's a no-op.

> - logind maybe even does a transition if there's multiple version of the
>   protocol, e.g. v2 allows modifiers, v1 only untiled, so it'd need to do
>   a blit to untiled if the new compositor only supports v1 switching
>   protocol

That would be fancy, but I'm not sure it's necessary. Maybe it becomes
useful one day, and then that day we can look into involving actual
protocol or logind. I just don't see a problem implementing the de
facto protocol v1 that already exists without logind or such.

> - new compositor takes over, and can continue the smooth transition since
>   it's a well-defined starting state with limited feature usage, _and_
>   everything else is reset to "defaults"
>  
> I fear that if we only solve the "resets to defaults" issue then we can
> draw ourselves into a corner for the smooth transition stuff, if e.g. the
> wrong entity in the above dance forces the reset to defaults.

I guess we fundamentally agree. It just doesn't matter who does the
reset to defaults if the old compositor does not, the transition cannot
be smooth.

To me it seems the key here is that the old compositor *must* use "a
simple KMS setup" and reset everything else to defaults.

Actually, that's what we require in practise already. If the old
compositor leaves something funny in KMS state, the new compositor will
either glitch or be permanently broken when taking over.

The only thing that needs more work to fix is to prevent the permanent
breakage.

I guess this line on thinking also shows the answer to what KMS programs
are supposed to do on switching out: clean up after yourself, but don't
shut CRTCs down or change video modes.


Thanks,
pq

[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2020-05-07  9:03 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-15  9:42 RFC: Drm-connector properties managed by another driver / privacy screen support Hans de Goede
2020-04-15  9:52 ` Daniel Vetter
2020-04-15 10:11   ` Hans de Goede
2020-04-15 10:22     ` Daniel Vetter
2020-04-15 11:39       ` Hans de Goede
2020-04-15 11:56         ` Hans de Goede
2020-04-15 12:01         ` Daniel Vetter
2020-04-15 13:02           ` Hans de Goede
2020-04-15 17:54             ` Daniel Vetter
2020-04-15 18:19               ` Hans de Goede
2020-04-15 18:29                 ` Daniel Vetter
2020-04-15 19:50                   ` Hans de Goede
2020-04-16  6:46                     ` Daniel Vetter
2020-04-15 15:28 ` Jani Nikula
2020-04-15 15:40   ` Hans de Goede
2020-04-15 17:14     ` [External] " Mark Pearson
2020-04-15 18:06       ` Hans de Goede
2020-04-15 19:20     ` Rajat Jain
2020-04-15 21:10       ` Jani Nikula
2020-04-15 21:21         ` Hans de Goede
2020-04-15 21:51           ` [External] " Mark Pearson
2020-04-17  9:05         ` Pekka Paalanen
2020-04-17  9:02     ` Pekka Paalanen
2020-04-17 11:55       ` Jani Nikula
2020-04-17 14:18         ` Daniel Vetter
2020-04-17 14:54           ` Benjamin Berg
2020-04-21 12:37         ` Hans de Goede
2020-04-21 12:40           ` Daniel Vetter
2020-04-21 14:46           ` Pekka Paalanen
2020-04-23 18:21             ` Rajat Jain
2020-04-24  7:40               ` Pekka Paalanen
2020-04-24  8:24                 ` Hans de Goede
2020-04-24  9:08                   ` Pekka Paalanen
2020-04-24 10:32                     ` Hans de Goede
2020-04-17 14:17       ` Daniel Vetter
2020-04-20  8:27         ` Operating KMS UAPI (Re: RFC: Drm-connector properties managed by another driver / privacy screen support) Pekka Paalanen
2020-04-20 10:04           ` Pekka Paalanen
2020-04-20 10:18             ` Simon Ser
2020-04-21 12:15             ` Daniel Vetter
2020-04-21 14:33               ` Pekka Paalanen
2020-04-21 14:39                 ` Simon Ser
2020-04-23 15:01                 ` Daniel Vetter
2020-04-24  8:32                   ` Pekka Paalanen
2020-04-28 14:51                     ` Daniel Vetter
2020-04-29 10:07                       ` Pekka Paalanen
2020-04-30 13:53                         ` Daniel Vetter
2020-05-04  9:49                           ` Pekka Paalanen
2020-05-04 11:00                             ` Daniel Vetter
2020-05-04 12:22                               ` Pekka Paalanen
2020-05-05  8:48                                 ` Daniel Vetter
2020-05-07  9:03                                   ` Pekka Paalanen
2020-04-20 10:15           ` Simon Ser
2020-04-20 12:22             ` Pekka Paalanen
2020-04-20 12:33               ` Simon Ser

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.