* [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
@ 2022-03-16  1:25 Daniel Dadap
  2022-03-16  2:50 ` Barnabás Pőcze
  2022-03-16 16:09 ` [PATCH] " Hans de Goede
  0 siblings, 2 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16  1:25 UTC
  To: platform-driver-x86; +Cc: Daniel Dadap, Alexandru Dinu

Some notebook systems with EC-driven backlight control appear to have a
firmware bug which causes the system to use GPU-driven backlight control
upon a fresh boot, but then switch to EC-driven backlight control
after completing a suspend/resume cycle. All the while, the firmware
reports that the backlight is under EC control, regardless of what is
actually controlling the backlight brightness.

This leads to the following behavior:

* nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
  WMI-wrapped ACPI method erroneously reporting EC control.
* nvidia-wmi-ec-backlight does not work until after a suspend/resume
  cycle, due to the backlight control actually being GPU-driven.
* GPU drivers also register their own backlight handlers: in the case
  of the notebook system where this behavior has been observed, both
  amdgpu and the NVIDIA proprietary driver register backlight handlers.
* The GPU which has backlight control upon a fresh boot (amdgpu in the
  case observed so far) can successfully control the backlight through
  its backlight driver's sysfs interface, but stops working after the
  first suspend/resume cycle.
* nvidia-wmi-ec-backlight is unable to control the backlight upon a
  fresh boot, but begins to work after the first suspend/resume cycle.
* The GPU which does not have backlight control (NVIDIA in this case)
  is not able to control the backlight at any point while the system
  is in operation. On similar hybrid systems with an EC-controlled
  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
  does not register its backlight handler. It has not been determined
  whether the non-functional handler registered by the NVIDIA driver
  is due to another firmware bug, or a bug in the NVIDIA driver.

Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
device, it takes precedence over the BACKLIGHT_RAW devices registered
by the GPU drivers. This in turn leads to backlight control appearing
to be non-functional until after completing a suspend/resume cycle.
However, it is still possible to control the backlight through direct
interaction with the working GPU driver's backlight sysfs interface.

These systems also appear to have a second firmware bug which resets
the EC's brightness level to 100% on resume, but leaves the state in
the kernel at the pre-suspend level. This causes attempts to save
and restore the backlight level across the suspend/resume cycle to
fail, due to the level appearing not to change even though it did.

In order to work around these issues, add quirk tables to detect
systems that are known to show these behaviors. So far, there is
only one known system that requires these workarounds, and both
issues are present on that system, but the quirks are tracked in
separate tables to make it easier to add them to other systems which
may exhibit one of the bugs, but not the other. The original systems
that this driver was tested on during development do not exhibit
either of these quirks.

If a system with the "GPU driver has backlight control" quirk is
detected, nvidia-wmi-ec-backlight will grab a reference to the working
(when freshly booted) GPU backlight handler and relay any backlight
brightness level change requests directed at the EC to also be applied
to the GPU backlight interface. This leads to redundant updates
directed at the GPU backlight driver after a suspend/resume cycle, but
it does allow the EC backlight control to work when the system is
freshly booted.

If a system with the "backlight level reset to full on resume" quirk
is detected, nvidia-wmi-ec-backlight will register a PM notifier to
reset the backlight to the previous level upon resume.

These workarounds are also plumbed through to kernel module parameters,
to make it easier for users who suspect they may be affected by one or
both of these bugs to test whether these workarounds are effective on
their systems as well.

Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
---
 .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
 1 file changed, 179 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
index 61e37194df70..ccb3b506c12c 100644
--- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
+++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
@@ -3,8 +3,11 @@
  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
  */
 
+#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME
+
 #include <linux/acpi.h>
 #include <linux/backlight.h>
+#include <linux/dmi.h>
 #include <linux/mod_devicetable.h>
 #include <linux/module.h>
 #include <linux/types.h>
@@ -75,6 +78,69 @@ struct wmi_brightness_args {
 	u32 ignored[3];
 };
 
+/**
+ * struct nvidia_wmi_ec_backlight_priv - driver private data
+ * @bl_dev:       the associated backlight device
+ * @proxy_target: backlight device which receives relayed brightness changes
+ * @notifier:     notifier block for resume callback
+ */
+struct nvidia_wmi_ec_backlight_priv {
+	struct backlight_device *bl_dev;
+	struct backlight_device *proxy_target;
+	struct notifier_block nb;
+};
+
+static char *backlight_proxy_target;
+module_param(backlight_proxy_target, charp, 0);
+MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
+
+static int max_reprobe_attempts = 128;
+module_param(max_reprobe_attempts, int, 0);
+MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
+
+static bool restore_level_on_resume;
+module_param(restore_level_on_resume, bool, 0);
+MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
+
+static int assign_relay_quirk(const struct dmi_system_id *id)
+{
+	backlight_proxy_target = id->driver_data;
+	return true;
+}
+
+#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
+	.callback = assign_relay_quirk,                  \
+	.matches = {                                     \
+		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
+		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
+	},                                               \
+	.driver_data = quirk_data                        \
+}
+
+static const struct dmi_system_id proxy_quirk_table[] = {
+	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
+	{ }
+};
+
+static int assign_restore_quirk(const struct dmi_system_id *id)
+{
+	restore_level_on_resume = true;
+	return true;
+}
+
+#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
+	.callback = assign_restore_quirk,                \
+	.matches = {                                     \
+		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
+		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
+	}                                                \
+}
+
+static const struct dmi_system_id restore_quirk_table[] = {
+	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
+	{ }
+};
+
 /**
  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
@@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
 	return 0;
 }
 
+static int scale_backlight_level(struct backlight_device *a,
+				 struct backlight_device *b)
+{
+	/* because floating point math in the kernel is annoying */
+	const int scaling_factor = 65536;
+	int level = a->props.brightness;
+	int relative_level = level * scaling_factor / a->props.max_brightness;
+
+	return relative_level * b->props.max_brightness / scaling_factor;
+}
+
 static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
 {
 	struct wmi_device *wdev = bl_get_data(bd);
+	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
+	struct backlight_device *proxy_target = priv->proxy_target;
+
+	if (proxy_target) {
+		int level = scale_backlight_level(bd, proxy_target);
+
+		if (backlight_device_set_brightness(proxy_target, level))
+			pr_warn("Failed to relay backlight update to \"%s\"",
+				backlight_proxy_target);
+	}
 
 	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
 	                             WMI_BRIGHTNESS_MODE_SET,
@@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
 	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
 };
 
+static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
+{
+
+	/*
+	 * On some systems, the EC backlight level gets reset to 100% when
+	 * resuming from suspend, but the backlight device state still reflects
+	 * the pre-suspend value. Refresh the existing state to sync the EC's
+	 * state back up with the kernel's.
+	 */
+	if (event == PM_POST_SUSPEND) {
+		struct nvidia_wmi_ec_backlight_priv *p;
+
+		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
+		return backlight_update_status(p->bl_dev);
+	}
+
+	return 0;
+}
+
 static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
 {
+	struct backlight_device *bdev, *target = NULL;
+	struct nvidia_wmi_ec_backlight_priv *priv;
 	struct backlight_properties props = {};
-	struct backlight_device *bdev;
 	u32 source;
 	int ret;
 
+	/*
+	 * Check quirks tables to see if this system needs any of the firmware
+	 * bug workarounds.
+	 */
+
+	/* User-set quirks from the module parameters take precedence */
+	if (!backlight_proxy_target)
+		dmi_check_system(proxy_quirk_table);
+
+	dmi_check_system(restore_quirk_table);
+
+	if (backlight_proxy_target && backlight_proxy_target[0]) {
+		static int num_reprobe_attempts;
+
+		target = backlight_device_get_by_name(backlight_proxy_target);
+
+		if (!target) {
+			/*
+			 * The target backlight device might not be ready;
+			 * try again and disable backlight proxying if it
+			 * fails too many times.
+			 */
+			if (num_reprobe_attempts < max_reprobe_attempts) {
+				num_reprobe_attempts++;
+				return -EPROBE_DEFER;
+			}
+
+			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
+				backlight_proxy_target, max_reprobe_attempts);
+		}
+	}
+
 	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
 	                           WMI_BRIGHTNESS_MODE_GET, &source);
 	if (ret)
@@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
 					      &wdev->dev, wdev,
 					      &nvidia_wmi_ec_backlight_ops,
 					      &props);
-	return PTR_ERR_OR_ZERO(bdev);
+
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	priv->bl_dev = bdev;
+
+	dev_set_drvdata(&wdev->dev, priv);
+
+	if (target) {
+		int level = scale_backlight_level(target, bdev);
+
+		if (backlight_device_set_brightness(bdev, level))
+			pr_warn("Unable to import initial brightness level from %s.",
+				backlight_proxy_target);
+		priv->proxy_target = target;
+	}
+
+	if (restore_level_on_resume) {
+		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
+		register_pm_notifier(&priv->nb);
+	}
+
+	return 0;
+}
+
+static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
+{
+	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
+	struct backlight_device *proxy_target = priv->proxy_target;
+
+	if (proxy_target)
+		put_device(&proxy_target->dev);
+
+	if (priv->nb.notifier_call)
+		unregister_pm_notifier(&priv->nb);
+
+	kfree(priv);
 }
 
 #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
@@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
 		.name = "nvidia-wmi-ec-backlight",
 	},
 	.probe = nvidia_wmi_ec_backlight_probe,
+	.remove = nvidia_wmi_ec_backlight_remove,
 	.id_table = nvidia_wmi_ec_backlight_id_table,
 };
 module_wmi_driver(nvidia_wmi_ec_backlight_driver);
-- 
2.27.0



* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16  1:25 [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware Daniel Dadap
@ 2022-03-16  2:50 ` Barnabás Pőcze
  2022-03-16 15:11   ` Daniel Dadap
  2022-03-16 16:09 ` [PATCH] " Hans de Goede
  1 sibling, 1 reply; 31+ messages in thread
From: Barnabás Pőcze @ 2022-03-16  2:50 UTC
  To: Daniel Dadap; +Cc: platform-driver-x86, Alexandru Dinu

Hi


The platform-driver-x86 maintainers should probably have been
CC'd. You may or may not know, but the `scripts/get_maintainer.pl`
script can be used to determine the appropriate recipients.

On Wednesday, March 16, 2022, at 2:25, Daniel Dadap wrote:
> Some notebook systems with EC-driven backlight control appear to have a
> firmware bug which causes the system to use GPU-driven backlight control
> upon a fresh boot, but then switch to EC-driven backlight control
> after completing a suspend/resume cycle. All the while, the firmware
> reports that the backlight is under EC control, regardless of what is
> actually controlling the backlight brightness.
>
> This leads to the following behavior:
>
> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>   WMI-wrapped ACPI method erroneously reporting EC control.
> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>   cycle, due to the backlight control actually being GPU-driven.
> * GPU drivers also register their own backlight handlers: in the case
>   of the notebook system where this behavior has been observed, both
>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>   case observed so far) can successfully control the backlight through
>   its backlight driver's sysfs interface, but stops working after the
>   first suspend/resume cycle.
> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>   fresh boot, but begins to work after the first suspend/resume cycle.
> * The GPU which does not have backlight control (NVIDIA in this case)
>   is not able to control the backlight at any point while the system
>   is in operation. On similar hybrid systems with an EC-controlled
>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>   does not register its backlight handler. It has not been determined
>   whether the non-functional handler registered by the NVIDIA driver
>   is due to another firmware bug, or a bug in the NVIDIA driver.
>
> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> device, it takes precedence over the BACKLIGHT_RAW devices registered
> by the GPU drivers. This in turn leads to backlight control appearing
> to be non-functional until after completing a suspend/resume cycle.
> However, it is still possible to control the backlight through direct
> interaction with the working GPU driver's backlight sysfs interface.
>
> These systems also appear to have a second firmware bug which resets
> the EC's brightness level to 100% on resume, but leaves the state in
> the kernel at the pre-suspend level. This causes attempts to save
> and restore the backlight level across the suspend/resume cycle to
> fail, due to the level appearing not to change even though it did.
>
> In order to work around these issues, add quirk tables to detect
> systems that are known to show these behaviors. So far, there is
> only one known system that requires these workarounds, and both
> issues are present on that system, but the quirks are tracked in
> separate tables to make it easier to add them to other systems which
> may exhibit one of the bugs, but not the other. The original systems
> that this driver was tested on during development do not exhibit
> either of these quirks.
>
> If a system with the "GPU driver has backlight control" quirk is
> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> (when freshly booted) GPU backlight handler and relay any backlight
> brightness level change requests directed at the EC to also be applied
> to the GPU backlight interface. This leads to redundant updates
> directed at the GPU backlight driver after a suspend/resume cycle, but
> it does allow the EC backlight control to work when the system is
> freshly booted.
>
> If a system with the "backlight level reset to full on resume" quirk
> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> reset the backlight to the previous level upon resume.
>
> These workarounds are also plumbed through to kernel module parameters,
> to make it easier for users who suspect they may be affected by one or
> both of these bugs to test whether these workarounds are effective on
> their systems as well.
>
> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> ---
>  .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
>  1 file changed, 179 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> index 61e37194df70..ccb3b506c12c 100644
> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> @@ -3,8 +3,11 @@
>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>   */
>
> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME

`KBUILD_MODNAME` is a string literal, so you can do e.g.

  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
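
To illustrate the suggestion, here is a minimal userspace sketch (not kernel code) of how the prefix is built by compile-time string literal concatenation. The `KBUILD_MODNAME` value and the `format_relay_warning()` helper are assumptions made up for the example; in the kernel, kbuild supplies the module name and `pr_warn()` applies `pr_fmt()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: stand-in for the string literal that kbuild would supply. */
#define KBUILD_MODNAME "nvidia_wmi_ec_backlight"

/* Adjacent string literals are concatenated at compile time, so no
 * extra "%s" argument is needed for the prefix. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Hypothetical helper: formats one warning with the pr_fmt() prefix,
 * writing into a caller-provided buffer instead of the kernel log. */
static int format_relay_warning(char *buf, size_t size, const char *target)
{
	assert(buf != NULL && size > 0);
	return snprintf(buf, size,
			pr_fmt("Failed to relay backlight update to \"%s\"\n"),
			target);
}
```

Note that this form of the macro does not append "\n" itself, so each message would need to carry its own trailing newline.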


> +
>  #include <linux/acpi.h>
>  #include <linux/backlight.h>
> +#include <linux/dmi.h>
>  #include <linux/mod_devicetable.h>
>  #include <linux/module.h>
>  #include <linux/types.h>
> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
>  	u32 ignored[3];
>  };
>
> +/**
> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> + * @bl_dev:       the associated backlight device
> + * @proxy_target: backlight device which receives relayed brightness changes
> + * @notifier:     notifier block for resume callback
> + */
> +struct nvidia_wmi_ec_backlight_priv {
> +	struct backlight_device *bl_dev;
> +	struct backlight_device *proxy_target;
> +	struct notifier_block nb;
> +};
> +
> +static char *backlight_proxy_target;
> +module_param(backlight_proxy_target, charp, 0);

It seems these module parameters are neither readable nor writable
via sysfs (the permissions are 0); is that intentional?


> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> +
> +static int max_reprobe_attempts = 128;

Can you elaborate on how this number was arrived at?


> +module_param(max_reprobe_attempts, int, 0);
> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> +
> +static bool restore_level_on_resume;
> +module_param(restore_level_on_resume, bool, 0);
> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> +
> +static int assign_relay_quirk(const struct dmi_system_id *id)
> +{
> +	backlight_proxy_target = id->driver_data;
> +	return true;
> +}
> +
> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
> +	.callback = assign_relay_quirk,                  \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
> +	},                                               \
> +	.driver_data = quirk_data                        \
> +}
> +
> +static const struct dmi_system_id proxy_quirk_table[] = {
> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
> +	{ }
> +};
> +
> +static int assign_restore_quirk(const struct dmi_system_id *id)
> +{
> +	restore_level_on_resume = true;
> +	return true;
> +}
> +
> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
> +	.callback = assign_restore_quirk,                \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
> +	}                                                \
> +}
> +
> +static const struct dmi_system_id restore_quirk_table[] = {
> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
> +	{ }
> +};
> +
>  /**
>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>  	return 0;
>  }
>
> +static int scale_backlight_level(struct backlight_device *a,
> +				 struct backlight_device *b)
> +{
> +	/* because floating point math in the kernel is annoying */
> +	const int scaling_factor = 65536;
> +	int level = a->props.brightness;
> +	int relative_level = level * scaling_factor / a->props.max_brightness;
> +
> +	return relative_level * b->props.max_brightness / scaling_factor;
> +}

Maybe

  fixp_linear_interpolate(0, 0, a->props.max_brightness, b->props.max_brightness, a->props.brightness);

? (from `linux/fixp-arith.h`)
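
For comparison, a minimal userspace sketch of the two approaches follows. `demo_linear_interpolate()` is a hypothetical stand-in with the same argument order as the helper named above, not the kernel implementation (it uses a 64-bit intermediate, which is an assumption of this sketch); `two_step_scale()` mirrors the arithmetic of `scale_backlight_level()` from the patch, with plain integers in place of the backlight devices.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: map x from [x0, x1] onto [y0, y1] with a single
 * integer division, so the result is truncated only once. */
static int demo_linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	assert(x1 != x0);
	return y0 + (int)((int64_t)(y1 - y0) * (x - x0) / (x1 - x0));
}

/* Same arithmetic as the patch's scale_backlight_level(): scale up by
 * 65536, divide, then scale back down. Truncates twice, and the first
 * multiplication overflows a 32-bit int once level reaches 32768. */
static int two_step_scale(int level, int src_max, int dst_max)
{
	const int scaling_factor = 65536;
	int relative_level = level * scaling_factor / src_max;

	return relative_level * dst_max / scaling_factor;
}
```

With a source range of 0..255 and a target range of 0..100, a level of 51 maps to exactly 20 with the single-division form, while the two-step form yields 19 because of the extra truncation.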


> +
>  static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>  {
>  	struct wmi_device *wdev = bl_get_data(bd);
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target) {
> +		int level = scale_backlight_level(bd, proxy_target);
> +
> +		if (backlight_device_set_brightness(proxy_target, level))
> +			pr_warn("Failed to relay backlight update to \"%s\"",
> +				backlight_proxy_target);
> +	}
>
>  	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>  	                             WMI_BRIGHTNESS_MODE_SET,
> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>  	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>  };
>
> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> +{
> +
> +	/*
> +	 * On some systems, the EC backlight level gets reset to 100% when
> +	 * resuming from suspend, but the backlight device state still reflects
> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> +	 * state back up with the kernel's.
> +	 */
> +	if (event == PM_POST_SUSPEND) {
> +		struct nvidia_wmi_ec_backlight_priv *p;
> +
> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> +		return backlight_update_status(p->bl_dev);

`backlight_update_status()` returns a negative errno, while the notifier chain
expects one of the `NOTIFY_*` values. It would probably be better to return
`NOTIFY_DONE` in all cases. Currently a suitable error from
`backlight_update_status()` will stop the notifier chain.
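
A minimal userspace sketch of why a negative errno can stop the chain is given below. The `NOTIFY_*` values are copied from `include/linux/notifier.h`; `chain_would_stop()` and `demo_resume_callback()` are hypothetical names, and the sketch assumes the chain walker tests the return value against `NOTIFY_STOP_MASK`, on a two's-complement machine.

```c
#include <assert.h>
#include <errno.h>

/* Values as defined in include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

/* Hypothetical model of the chain walker's test: the chain stops when
 * the callback's return value has the stop bit set. */
static int chain_would_stop(int ret)
{
	return (ret & NOTIFY_STOP_MASK) != 0;
}

/* Hypothetical callback shape: the update result is consumed (a real
 * driver might log a failure) and NOTIFY_DONE is returned regardless. */
static int demo_resume_callback(int update_result)
{
	assert(update_result <= 0); /* errno style: 0 or a negative value */
	return NOTIFY_DONE;
}
```

Small negative values such as `-EINVAL` have bit 15 set in two's complement, so returning them directly looks like a stop request to the chain.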


> +	}
> +
> +	return 0;
> +}
> +
>  static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>  {
> +	struct backlight_device *bdev, *target = NULL;
> +	struct nvidia_wmi_ec_backlight_priv *priv;
>  	struct backlight_properties props = {};
> -	struct backlight_device *bdev;
>  	u32 source;
>  	int ret;
>
> +	/*
> +	 * Check quirks tables to see if this system needs any of the firmware
> +	 * bug workarounds.
> +	 */
> +
> +	/* User-set quirks from the module parameters take precedence */
> +	if (!backlight_proxy_target)
> +		dmi_check_system(proxy_quirk_table);
> +
> +	dmi_check_system(restore_quirk_table);
> +
> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> +		static int num_reprobe_attempts;
> +
> +		target = backlight_device_get_by_name(backlight_proxy_target);
> +
> +		if (!target) {
> +			/*
> +			 * The target backlight device might not be ready;
> +			 * try again and disable backlight proxying if it
> +			 * fails too many times.
> +			 */
> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> +				num_reprobe_attempts++;
> +				return -EPROBE_DEFER;
> +			}
> +
> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> +				backlight_proxy_target, max_reprobe_attempts);
> +		}
> +	}

I think the reference to `target` is not dropped if probe fails later on. You probably need to add something like:

  if (target) {
    ret = devm_add_action_or_reset(&wdev->dev, put_device_wrapper, target);
    if (ret < 0)
      return ret;
  }
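
The semantics being relied on can be modelled in userspace as follows. This is only a sketch: `demo_dev`, `demo_add_action_or_reset()`, `demo_release_all()` and `demo_put_ref()` are hypothetical stand-ins for the device, the devres call, the teardown done by the driver core, and a wrapper around `put_device()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ACTIONS 4

struct demo_action {
	void (*fn)(void *data);
	void *data;
};

/* Hypothetical device with a fixed-size list of cleanup actions. */
struct demo_dev {
	struct demo_action actions[DEMO_MAX_ACTIONS];
	size_t count;
};

/* Register a cleanup action; if registration fails, run the action
 * immediately so the caller never has to clean up by hand. */
static int demo_add_action_or_reset(struct demo_dev *dev,
				    void (*fn)(void *data), void *data)
{
	if (dev->count >= DEMO_MAX_ACTIONS) {
		fn(data);
		return -ENOMEM;
	}
	dev->actions[dev->count].fn = fn;
	dev->actions[dev->count].data = data;
	dev->count++;
	return 0;
}

/* Run all registered actions in reverse order, as would happen when
 * probe fails or the driver is unbound. */
static void demo_release_all(struct demo_dev *dev)
{
	while (dev->count > 0) {
		struct demo_action *a = &dev->actions[--dev->count];

		a->fn(a->data);
	}
}

/* Stand-in for a put_device() wrapper: drops one reference. */
static void demo_put_ref(void *data)
{
	int *refcount = data;

	assert(*refcount > 0);
	(*refcount)--;
}
```

Once the action is registered, every later error return from probe and the remove path drop the reference through the same mechanism.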


> +
>  	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>  	                           WMI_BRIGHTNESS_MODE_GET, &source);
>  	if (ret)
> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>  					      &wdev->dev, wdev,
>  					      &nvidia_wmi_ec_backlight_ops,
>  					      &props);
> -	return PTR_ERR_OR_ZERO(bdev);
> +
> +	if (IS_ERR(bdev))
> +		return PTR_ERR(bdev);
> +
> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);

`devm_kzalloc()` would probably be better and you should check if `!priv`.


> +	priv->bl_dev = bdev;
> +
> +	dev_set_drvdata(&wdev->dev, priv);
> +
> +	if (target) {
> +		int level = scale_backlight_level(target, bdev);
> +
> +		if (backlight_device_set_brightness(bdev, level))
> +			pr_warn("Unable to import initial brightness level from %s.",
> +				backlight_proxy_target);
> +		priv->proxy_target = target;
> +	}
> +
> +	if (restore_level_on_resume) {
> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> +		register_pm_notifier(&priv->nb);
> +	}
> +
> +	return 0;
> +}
> +
> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> +{
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target)
> +		put_device(&proxy_target->dev);

If you switch to `devm_add_action_or_reset()`, this will not be needed.


> +
> +	if (priv->nb.notifier_call)
> +		unregister_pm_notifier(&priv->nb);
> +
> +	kfree(priv);

If you switch to `devm_kzalloc()`, this won't be needed.


>  }
>
>  #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>  		.name = "nvidia-wmi-ec-backlight",
>  	},
>  	.probe = nvidia_wmi_ec_backlight_probe,
> +	.remove = nvidia_wmi_ec_backlight_remove,
>  	.id_table = nvidia_wmi_ec_backlight_id_table,
>  };
>  module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> --
> 2.27.0
>

Lastly, is it expected that these bugs will be properly fixed?


Regards,
Barnabás Pőcze


* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16  2:50 ` Barnabás Pőcze
@ 2022-03-16 15:11   ` Daniel Dadap
  2022-03-16 15:29     ` Limonciello, Mario
  2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
  0 siblings, 2 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 15:11 UTC
  To: Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross

Thanks for the feedback. I'll send out a v2 shortly. Alex, can you
please retest when I do, to make sure there aren't any regressions? None
of these suggestions affects the core flow of either workaround, so I
don't expect any regressions that wouldn't also reproduce on my EC
backlight system, which has neither of these problems. I can send you
the updated version off-list first if you prefer.

Detailed replies below:

On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
> Hi
>
>
> The platform-driver-x86 maintainers should probably have been
> CC'd. You may or may not know, but the `scripts/get_maintainer.pl`
> script can be used to determine the appropriate recipients.


Indeed. I've copied the pdx86 maintainers on this message and will for 
future correspondence regarding this patch.


> On Wednesday, March 16, 2022, at 2:25, Daniel Dadap wrote:
>> Some notebook systems with EC-driven backlight control appear to have a
>> firmware bug which causes the system to use GPU-driven backlight control
>> upon a fresh boot, but then switch to EC-driven backlight control
>> after completing a suspend/resume cycle. All the while, the firmware
>> reports that the backlight is under EC control, regardless of what is
>> actually controlling the backlight brightness.
>>
>> This leads to the following behavior:
>>
>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>    WMI-wrapped ACPI method erroneously reporting EC control.
>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>    cycle, due to the backlight control actually being GPU-driven.
>> * GPU drivers also register their own backlight handlers: in the case
>>    of the notebook system where this behavior has been observed, both
>>    amdgpu and the NVIDIA proprietary driver register backlight handlers.
>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>    case observed so far) can successfully control the backlight through
>>    its backlight driver's sysfs interface, but stops working after the
>>    first suspend/resume cycle.
>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>    fresh boot, but begins to work after the first suspend/resume cycle.
>> * The GPU which does not have backlight control (NVIDIA in this case)
>>    is not able to control the backlight at any point while the system
>>    is in operation. On similar hybrid systems with an EC-controlled
>>    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>    does not register its backlight handler. It has not been determined
>>    whether the non-functional handler registered by the NVIDIA driver
>>    is due to another firmware bug, or a bug in the NVIDIA driver.
>>
>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>> by the GPU drivers. This in turn leads to backlight control appearing
>> to be non-functional until after completing a suspend/resume cycle.
>> However, it is still possible to control the backlight through direct
>> interaction with the working GPU driver's backlight sysfs interface.
>>
>> These systems also appear to have a second firmware bug which resets
>> the EC's brightness level to 100% on resume, but leaves the state in
>> the kernel at the pre-suspend level. This causes attempts to save
>> and restore the backlight level across the suspend/resume cycle to
>> fail, due to the level appearing not to change even though it did.
>>
>> In order to work around these issues, add quirk tables to detect
>> systems that are known to show these behaviors. So far, there is
>> only one known system that requires these workarounds, and both
>> issues are present on that system, but the quirks are tracked in
>> separate tables to make it easier to add them to other systems which
>> may exhibit one of the bugs, but not the other. The original systems
>> that this driver was tested on during development do not exhibit
>> either of these quirks.
>>
>> If a system with the "GPU driver has backlight control" quirk is
>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>> (when freshly booted) GPU backlight handler and relay any backlight
>> brightness level change requests directed at the EC to also be applied
>> to the GPU backlight interface. This leads to redundant updates
>> directed at the GPU backlight driver after a suspend/resume cycle, but
>> it does allow the EC backlight control to work when the system is
>> freshly booted.
>>
>> If a system with the "backlight level reset to full on resume" quirk
>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>> reset the backlight to the previous level upon resume.
>>
>> These workarounds are also plumbed through to kernel module parameters,
>> to make it easier for users who suspect they may be affected by one or
>> both of these bugs to test whether these workarounds are effective on
>> their systems as well.
>>
>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>> ---
>>   .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
>>   1 file changed, 179 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> index 61e37194df70..ccb3b506c12c 100644
>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> @@ -3,8 +3,11 @@
>>    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>    */
>>
>> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME
> `KBUILD_MODNAME` is a string literal, so you can do e.g.
>
>    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
>
>> +
>>   #include <linux/acpi.h>
>>   #include <linux/backlight.h>
>> +#include <linux/dmi.h>
>>   #include <linux/mod_devicetable.h>
>>   #include <linux/module.h>
>>   #include <linux/types.h>
>> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
>>   	u32 ignored[3];
>>   };
>>
>> +/**
>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>> + * @bl_dev:       the associated backlight device
>> + * @proxy_target: backlight device which receives relayed brightness changes
>> + * @notifier:     notifier block for resume callback
>> + */
>> +struct nvidia_wmi_ec_backlight_priv {
>> +	struct backlight_device *bl_dev;
>> +	struct backlight_device *proxy_target;
>> +	struct notifier_block nb;
>> +};
>> +
>> +static char *backlight_proxy_target;
>> +module_param(backlight_proxy_target, charp, 0);
> It seems these module parameters are neither readable nor writable,
> is that intentional?


It was intentional that they not be writable, because I didn't want to 
have to plumb everything through to handle changing the values after 
probe. However, you are right that it could still be useful to set up 
the sysfs entries to allow reading the values, as this could be useful 
information for someone who wants to check if either of these quirks is
enabled.


>
>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>> +
>> +static int max_reprobe_attempts = 128;
> Can you elaborate how this number was arrived at?
>

It's just a medium-small round number. I didn't want probe to return 
-EPROBE_DEFER forever if e.g. somebody specified a wrong device name or 
if the target device name changes and the entry in the quirks table goes 
out of date. On the system I tested this on, the amdgpu_bl1 device was 
accessible on the 14th probe attempt. If there's some better value to 
plug in here, or if it's actually considered more correct to just never 
succeed at probe if the workaround is enabled but the target device cannot
be found, I'd be happy to change it.
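
To make the retry limit concrete, here is a minimal userspace sketch of the
bounded-deferral logic described above. It is only a model, not driver code:
probe_model() and its target_found argument are hypothetical stand-ins for
the probe function and the backlight_device_get_by_name() lookup, and
EPROBE_DEFER is defined locally with the kernel's numeric value.

```c
#include <stdbool.h>

#define EPROBE_DEFER 517 /* same numeric value as the kernel's errno */

/* Hypothetical stand-ins for the module parameter and the static counter. */
static int max_reprobe_attempts = 128;
static int num_reprobe_attempts;

/*
 * Model of the probe-time decision. target_found stands in for a successful
 * lookup of the proxy target. Returns -EPROBE_DEFER while retries remain,
 * otherwise 0; *proxy_enabled reports whether relaying stays active.
 */
static int probe_model(bool target_found, bool *proxy_enabled)
{
	*proxy_enabled = target_found;
	if (target_found)
		return 0;

	if (num_reprobe_attempts < max_reprobe_attempts) {
		num_reprobe_attempts++;
		return -EPROBE_DEFER;
	}

	/* Retries exhausted: carry on without the proxy instead of failing. */
	return 0;
}
```

With the limit set to 3 and a target that never appears, the first three
calls defer and the fourth succeeds with proxying disabled.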


>> +module_param(max_reprobe_attempts, int, 0);
>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>> +
>> +static bool restore_level_on_resume;
>> +module_param(restore_level_on_resume, bool, 0);
>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>> +
>> +static int assign_relay_quirk(const struct dmi_system_id *id)
>> +{
>> +	backlight_proxy_target = id->driver_data;
>> +	return true;
>> +}
>> +
>> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
>> +	.callback = assign_relay_quirk,                  \
>> +	.matches = {                                     \
>> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
>> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
>> +	},                                               \
>> +	.driver_data = quirk_data                        \
>> +}
>> +
>> +static const struct dmi_system_id proxy_quirk_table[] = {
>> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
>> +	{ }
>> +};
>> +
>> +static int assign_restore_quirk(const struct dmi_system_id *id)
>> +{
>> +	restore_level_on_resume = true;
>> +	return true;
>> +}
>> +
>> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
>> +	.callback = assign_restore_quirk,                \
>> +	.matches = {                                     \
>> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
>> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
>> +	}                                                \
>> +}
>> +
>> +static const struct dmi_system_id restore_quirk_table[] = {
>> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
>> +	{ }
>> +};
>> +
>>   /**
>>    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>   	return 0;
>>   }
>>
>> +static int scale_backlight_level(struct backlight_device *a,
>> +				 struct backlight_device *b)
>> +{
>> +	/* because floating point math in the kernel is annoying */
>> +	const int scaling_factor = 65536;
>> +	int level = a->props.brightness;
>> +	int relative_level = level * scaling_factor / a->props.max_brightness;
>> +
>> +	return relative_level * b->props.max_brightness / scaling_factor;
>> +}
> Maybe
>
>    fixp_linear_interpolate(0, 0, a->props.max_brightness, b->props.max_brightness, a->props.brightness);
>
> ? (from `linux/fixp-arith.h`)


Yes, this is exactly what I want; thank you.
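
For reference, a userspace sketch of the interpolation arithmetic, assuming
fixp_linear_interpolate() behaves as a plain integer linear interpolation
between two points. linear_interpolate() and scale_level() are illustrative
names, not kernel API.

```c
/*
 * Userspace model, for illustration only. Maps x from the range [x0, x1]
 * onto [y0, y1] using integer math. The product (y1 - y0) * (x - x0) must
 * fit in an int, which holds for small ranges such as 0..255 and 0..65535.
 */
static int linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	if (y0 == y1 || x == x0)
		return y0;
	if (x1 == x0 || x == x1)
		return y1;

	return y0 + ((y1 - y0) * (x - x0) / (x1 - x0));
}

/* Hypothetical helper: rescale a level from one brightness range to another. */
static int scale_level(int level, int src_max, int dst_max)
{
	return linear_interpolate(0, 0, src_max, dst_max, level);
}
```

For example, scale_level(128, 255, 65535) yields 32896, and the endpoints
map exactly (0 to 0, 255 to 65535) without an intermediate scaling factor.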


>
>> +
>>   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>   {
>>   	struct wmi_device *wdev = bl_get_data(bd);
>> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +	struct backlight_device *proxy_target = priv->proxy_target;
>> +
>> +	if (proxy_target) {
>> +		int level = scale_backlight_level(bd, proxy_target);
>> +
>> +		if (backlight_device_set_brightness(proxy_target, level))
>> +			pr_warn("Failed to relay backlight update to \"%s\"",
>> +				backlight_proxy_target);
>> +	}
>>
>>   	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>   	                             WMI_BRIGHTNESS_MODE_SET,
>> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>   	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>   };
>>
>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>> +{
>> +
>> +	/*
>> +	 * On some systems, the EC backlight level gets reset to 100% when
>> +	 * resuming from suspend, but the backlight device state still reflects
>> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
>> +	 * state back up with the kernel's.
>> +	 */
>> +	if (event == PM_POST_SUSPEND) {
>> +		struct nvidia_wmi_ec_backlight_priv *p;
>> +
>> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>> +		return backlight_update_status(p->bl_dev);
> `backlight_update_status()` returns a negative errno while the notifier chain
> expects something else. It would probably be better to return `NOTIFY_DONE`
> in all cases. Currently a suitable error from `backlight_update_status()` will
> stop the notifier chain.


Thanks for catching that: I should have paid more attention to the 
notifier callback signature.
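
To illustrate why a negative errno is a problem here, below is a simplified
userspace model of a notifier chain walk. The NOTIFY_* values match the
definitions in include/linux/notifier.h as far as I know; call_chain_model()
and the two callbacks are hypothetical names used only for this sketch.

```c
#define NOTIFY_DONE      0x0000 /* don't care */
#define NOTIFY_OK        0x0001 /* suits me */
#define NOTIFY_STOP_MASK 0x8000 /* don't call further */

typedef int (*notifier_fn)(unsigned long event);

/* Simplified chain walk: stop as soon as a callback sets the stop bit. */
static int call_chain_model(notifier_fn *chain, int n, unsigned long event,
			    int *called)
{
	int ret = NOTIFY_DONE;

	*called = 0;
	for (int i = 0; i < n; i++) {
		ret = chain[i](event);
		(*called)++;
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Models a callback that leaks a negative errno (here -5, i.e. -EIO). */
static int returns_errno(unsigned long event)
{
	(void)event;
	return -5;
}

/* Models a callback that always reports NOTIFY_DONE. */
static int returns_done(unsigned long event)
{
	(void)event;
	return NOTIFY_DONE;
}
```

On a two's-complement machine, -5 has bit 0x8000 set, so a chain of
{ returns_errno, returns_done } stops after the first callback, while
{ returns_done, returns_done } runs both.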


>
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>   {
>> +	struct backlight_device *bdev, *target = NULL;
>> +	struct nvidia_wmi_ec_backlight_priv *priv;
>>   	struct backlight_properties props = {};
>> -	struct backlight_device *bdev;
>>   	u32 source;
>>   	int ret;
>>
>> +	/*
>> +	 * Check quirks tables to see if this system needs any of the firmware
>> +	 * bug workarounds.
>> +	 */
>> +
>> +	/* User-set quirks from the module parameters take precedence */
>> +	if (!backlight_proxy_target)
>> +		dmi_check_system(proxy_quirk_table);
>> +
>> +	dmi_check_system(restore_quirk_table);
>> +
>> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
>> +		static int num_reprobe_attempts;
>> +
>> +		target = backlight_device_get_by_name(backlight_proxy_target);
>> +
>> +		if (!target) {
>> +			/*
>> +			 * The target backlight device might not be ready;
>> +			 * try again and disable backlight proxying if it
>> +			 * fails too many times.
>> +			 */
>> +			if (num_reprobe_attempts < max_reprobe_attempts) {
>> +				num_reprobe_attempts++;
>> +				return -EPROBE_DEFER;
>> +			}
>> +
>> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>> +				backlight_proxy_target, max_reprobe_attempts);
>> +		}
>> +	}
> I think `target` is not put in case of error. You probably need to add something like:
>
>    if (target) {
>      ret = devm_add_action_or_reset(&wdev->dev, put_device_wrapper, target);
>      if (ret < 0)
>        return ret;
>    }
>
>
>> +
>>   	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>   	                           WMI_BRIGHTNESS_MODE_GET, &source);
>>   	if (ret)
>> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>   					      &wdev->dev, wdev,
>>   					      &nvidia_wmi_ec_backlight_ops,
>>   					      &props);
>> -	return PTR_ERR_OR_ZERO(bdev);
>> +
>> +	if (IS_ERR(bdev))
>> +		return PTR_ERR(bdev);
>> +
>> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> `devm_kzalloc()` would probably be better and you should check if `!priv`.
>
>
>> +	priv->bl_dev = bdev;
>> +
>> +	dev_set_drvdata(&wdev->dev, priv);
>> +
>> +	if (target) {
>> +		int level = scale_backlight_level(target, bdev);
>> +
>> +		if (backlight_device_set_brightness(bdev, level))
>> +			pr_warn("Unable to import initial brightness level from %s.",
>> +				backlight_proxy_target);
>> +		priv->proxy_target = target;
>> +	}
>> +
>> +	if (restore_level_on_resume) {
>> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>> +		register_pm_notifier(&priv->nb);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>> +{
>> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +	struct backlight_device *proxy_target = priv->proxy_target;
>> +
>> +	if (proxy_target)
>> +		put_device(&proxy_target->dev);
> If you switch to `devm_add_action_or_reset()`, this will not be needed.
>
>
>> +
>> +	if (priv->nb.notifier_call)
>> +		unregister_pm_notifier(&priv->nb);
>> +
>> +	kfree(priv);
> If you switch to `devm_kzalloc()`, this won't be needed.


Thank you, the devm_*() variants are indeed useful.
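
As a rough illustration of what the managed variants buy, here is a small
userspace model of a per-device cleanup list. It is a sketch under the
assumption that managed actions run automatically, newest first, when the
device is released, and that the "or reset" variant runs the action
immediately if registration fails. All names below are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

struct action {
	void (*fn)(void *);
	void *data;
	struct action *next;
};

struct dev_model {
	struct action *actions; /* most recently added action first */
};

/* Register a cleanup; if that fails, run it right away ("or reset"). */
static int add_action_or_reset_model(struct dev_model *d,
				     void (*fn)(void *), void *data)
{
	struct action *a = malloc(sizeof(*a));

	if (!a) {
		fn(data);
		return -ENOMEM;
	}
	a->fn = fn;
	a->data = data;
	a->next = d->actions;
	d->actions = a;
	return 0;
}

/* Models probe failure or device removal: run all actions, newest first. */
static void release_all_model(struct dev_model *d)
{
	while (d->actions) {
		struct action *a = d->actions;

		d->actions = a->next;
		a->fn(a->data);
		free(a);
	}
}

/* Models dropping a device reference by decrementing a counter. */
static void put_ref(void *data)
{
	int *refcount = data;

	(*refcount)--;
}
```

Once put_ref is registered, every exit path that releases the device drops
the reference, so neither the error paths nor a remove callback need to do
it by hand.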


>
>>   }
>>
>>   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>   		.name = "nvidia-wmi-ec-backlight",
>>   	},
>>   	.probe = nvidia_wmi_ec_backlight_probe,
>> +	.remove = nvidia_wmi_ec_backlight_remove,
>>   	.id_table = nvidia_wmi_ec_backlight_id_table,
>>   };
>>   module_wmi_driver(nvidia_wmi_ec_backlight_driver);
>> --
>> 2.27.0
>>
> Lastly, is it expected that these bugs will be properly fixed?


Possibly, but I wouldn't hold out hope for it for an issue at this scale 
on an already shipping system.


>
> Regards,
> Barnabás Pőcze

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 15:11   ` Daniel Dadap
@ 2022-03-16 15:29     ` Limonciello, Mario
  2022-03-16 17:08       ` Daniel Dadap
  2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
  1 sibling, 1 reply; 31+ messages in thread
From: Limonciello, Mario @ 2022-03-16 15:29 UTC (permalink / raw)
  To: Daniel Dadap, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

[Public]

+ Alex D

Alex, just FYI: this came in through an AMD bug tracker, and I wanted you to be aware that workarounds are going into nvidia-wmi-ec-backlight for some firmware problems with the mux.
IIRC that was the original suspicion on the bug reports too.

Comments inline as well.

> -----Original Message-----
> From: Daniel Dadap <ddadap@nvidia.com>
> Sent: Wednesday, March 16, 2022 10:11
> To: Barnabás Pőcze <pobrn@protonmail.com>
> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
> markgross@kernel.org
> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
> confused firmware
> 
> Thanks for the feedback. I'll send out a v2 shortly: Alex, can you
> please retest when I do to make sure there aren't any regressions? None
> of these suggestions affect the core flow of how either of the
> workarounds work, so I'm not expecting any that wouldn't also reproduce
> on my EC backlight system that doesn't have either of these problems,
> but I can send you the updated version off-list first if you prefer.
> 
> Detailed replies below:
> 
> On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
> > Hi
> >
> >
> > The platform-driver-x86 maintainers should've probably been
> > CCd. You may or may not know, but the `scripts/get_maintainer.pl`
> > script can be used to determine the appropriate recipients.
> 
> 
> Indeed. I've copied the pdx86 maintainers on this message and will for
> future correspondence regarding this patch.
> 
> 
> > On Wednesday, March 16, 2022 at 2:25, Daniel Dadap wrote:
> >> Some notebook systems with EC-driven backlight control appear to have a
> >> firmware bug which causes the system to use GPU-driven backlight control
> >> upon a fresh boot, but then switches to EC-driven backlight control
> >> after completing a suspend/resume cycle. All the while, the firmware
> >> reports that the backlight is under EC control, regardless of what is
> >> actually controlling the backlight brightness.
> >>
> >> This leads to the following behavior:
> >>
> >> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
> >>    WMI-wrapped ACPI method erroneously reporting EC control.
> >> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
> >>    cycle, due to the backlight control actually being GPU-driven.
> >> * GPU drivers also register their own backlight handlers: in the case
> >>    of the notebook system where this behavior has been observed, both
> >>    amdgpu and the NVIDIA proprietary driver register backlight handlers.
> >> * The GPU which has backlight control upon a fresh boot (amdgpu in the
> >>    case observed so far) can successfully control the backlight through
> >>    its backlight driver's sysfs interface, but stops working after the
> >>    first suspend/resume cycle.
> >> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
> >>    fresh boot, but begins to work after the first suspend/resume cycle.
> >> * The GPU which does not have backlight control (NVIDIA in this case)
> >>    is not able to control the backlight at any point while the system
> >>    is in operation. On similar hybrid systems with an EC-controlled
> >>    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
> >>    does not register its backlight handler. It has not been determined
> >>    whether the non-functional handler registered by the NVIDIA driver
> >>    is due to another firmware bug, or a bug in the NVIDIA driver.
> >>
> >> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> >> device, it takes precedence over the BACKLIGHT_RAW devices registered
> >> by the GPU drivers. This in turn leads to backlight control appearing
> >> to be non-functional until after completing a suspend/resume cycle.
> >> However, it is still possible to control the backlight through direct
> >> interaction with the working GPU driver's backlight sysfs interface.
> >>
> >> These systems also appear to have a second firmware bug which resets
> >> the EC's brightness level to 100% on resume, but leaves the state in
> >> the kernel at the pre-suspend level. This causes attempts to save
> >> and restore the backlight level across the suspend/resume cycle to
> >> fail, due to the level appearing not to change even though it did.
> >>
> >> In order to work around these issues, add quirk tables to detect
> >> systems that are known to show these behaviors. So far, there is
> >> only one known system that requires these workarounds, and both
> >> issues are present on that system, but the quirks are tracked in
> >> separate tables to make it easier to add them to other systems which
> >> may exhibit one of the bugs, but not the other. The original systems
> >> that this driver was tested on during development do not exhibit
> >> either of these quirks.
> >>
> >> If a system with the "GPU driver has backlight control" quirk is
> >> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> >> (when freshly booted) GPU backlight handler and relay any backlight
> >> brightness level change requests directed at the EC to also be applied
> >> to the GPU backlight interface. This leads to redundant updates
> >> directed at the GPU backlight driver after a suspend/resume cycle, but
> >> it does allow the EC backlight control to work when the system is
> >> freshly booted.
> >>
> >> If a system with the "backlight level reset to full on resume" quirk
> >> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> >> reset the backlight to the previous level upon resume.
> >>
> >> These workarounds are also plumbed through to kernel module parameters,
> >> to make it easier for users who suspect they may be affected by one or
> >> both of these bugs to test whether these workarounds are effective on
> >> their systems as well.
> >>
> >> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> >> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> >> ---
> >>   .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
> >>   1 file changed, 179 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >> index 61e37194df70..ccb3b506c12c 100644
> >> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >> @@ -3,8 +3,11 @@
> >>    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
> >>    */
> >>
> >> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME
> > `KBUILD_MODNAME` is a string literal, so you can do e.g.
> >
> >    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >
> >
> >> +
> >>   #include <linux/acpi.h>
> >>   #include <linux/backlight.h>
> >> +#include <linux/dmi.h>
> >>   #include <linux/mod_devicetable.h>
> >>   #include <linux/module.h>
> >>   #include <linux/types.h>
> >> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
> >>   	u32 ignored[3];
> >>   };
> >>
> >> +/**
> >> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> >> + * @bl_dev:       the associated backlight device
> >> + * @proxy_target: backlight device which receives relayed brightness changes
> >> + * @notifier:     notifier block for resume callback
> >> + */
> >> +struct nvidia_wmi_ec_backlight_priv {
> >> +	struct backlight_device *bl_dev;
> >> +	struct backlight_device *proxy_target;
> >> +	struct notifier_block nb;
> >> +};
> >> +
> >> +static char *backlight_proxy_target;
> >> +module_param(backlight_proxy_target, charp, 0);
> > It seems these module parameters are neither readable nor writable,
> > is that intentional?
> 
> 
> It was intentional that they not be writable, because I didn't want to
> have to plumb everything through to handle changing the values after
> probe. However, you are right that it could still be useful to set up
> the sysfs entries to allow reading the values, as this could be useful
> information for someone who wants to check if either of these quirks is
> enabled.
> 
> 
> >
> >> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> >> +
> >> +static int max_reprobe_attempts = 128;
> > Can you elaborate how this number was arrived at?
> >
> 
> It's just a medium-small round number. I didn't want probe to return
> -EPROBE_DEFER forever if e.g. somebody specified a wrong device name or
> if the target device name changes and the entry in the quirks table goes
> out of date. On the system I tested this on, the amdgpu_bl1 device was
> accessible on the 14th probe attempt. If there's some better value to
> plug in here, or if it's actually considered more correct to just never
> succeed at probe if the workaround is enabled but the target device cannot
> be found, I'd be happy to change it.
> 
> 
> >> +module_param(max_reprobe_attempts, int, 0);
> >> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> >> +
> >> +static bool restore_level_on_resume;
> >> +module_param(restore_level_on_resume, bool, 0);
> >> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> >> +
> >> +static int assign_relay_quirk(const struct dmi_system_id *id)
> >> +{
> >> +	backlight_proxy_target = id->driver_data;
> >> +	return true;
> >> +}
> >> +
> >> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
> >> +	.callback = assign_relay_quirk,                  \
> >> +	.matches = {                                     \
> >> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> >> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
> >> +	},                                               \
> >> +	.driver_data = quirk_data                        \
> >> +}
> >> +
> >> +static const struct dmi_system_id proxy_quirk_table[] = {
> >> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
> >> +	{ }
> >> +};
> >> +
> >> +static int assign_restore_quirk(const struct dmi_system_id *id)
> >> +{
> >> +	restore_level_on_resume = true;
> >> +	return true;
> >> +}
> >> +
> >> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
> >> +	.callback = assign_restore_quirk,                \
> >> +	.matches = {                                     \
> >> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> >> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
> >> +	}                                                \
> >> +}
> >> +
> >> +static const struct dmi_system_id restore_quirk_table[] = {
> >> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
> >> +	{ }
> >> +};
> >> +
> >>   /**
> >>    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
> >>    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> >> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
> >>   	return 0;
> >>   }
> >>
> >> +static int scale_backlight_level(struct backlight_device *a,
> >> +				 struct backlight_device *b)
> >> +{
> >> +	/* because floating point math in the kernel is annoying */
> >> +	const int scaling_factor = 65536;
> >> +	int level = a->props.brightness;
> >> +	int relative_level = level * scaling_factor / a->props.max_brightness;
> >> +
> >> +	return relative_level * b->props.max_brightness / scaling_factor;
> >> +}
> > Maybe
> >
> >    fixp_linear_interpolate(0, 0, a->props.max_brightness, b->props.max_brightness, a->props.brightness);
> >
> > ? (from `linux/fixp-arith.h`)
> 
> 
> Yes, this is exactly what I want; thank you.
> 
> 
> >
> >> +
> >>   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
> >>   {
> >>   	struct wmi_device *wdev = bl_get_data(bd);
> >> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> >> +	struct backlight_device *proxy_target = priv->proxy_target;
> >> +
> >> +	if (proxy_target) {
> >> +		int level = scale_backlight_level(bd, proxy_target);
> >> +
> >> +		if (backlight_device_set_brightness(proxy_target, level))
> >> +			pr_warn("Failed to relay backlight update to \"%s\"",
> >> +				backlight_proxy_target);
> >> +	}
> >>
> >>   	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
> >>   	                             WMI_BRIGHTNESS_MODE_SET,
> >> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
> >>   	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
> >>   };
> >>
> >> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> >> +{
> >> +
> >> +	/*
> >> +	 * On some systems, the EC backlight level gets reset to 100% when
> >> +	 * resuming from suspend, but the backlight device state still reflects
> >> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> >> +	 * state back up with the kernel's.
> >> +	 */
> >> +	if (event == PM_POST_SUSPEND) {
> >> +		struct nvidia_wmi_ec_backlight_priv *p;
> >> +
> >> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> >> +		return backlight_update_status(p->bl_dev);
> > `backlight_update_status()` returns a negative errno while the notifier chain
> > expects something else. It would probably be better to return `NOTIFY_DONE`
> > in all cases. Currently a suitable error from `backlight_update_status()` will
> > stop the notifier chain.
> 
> 
> Thanks for catching that: I should have paid more attention to the
> notifier callback signature.
> 
> 
> >
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >>   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
> >>   {
> >> +	struct backlight_device *bdev, *target = NULL;
> >> +	struct nvidia_wmi_ec_backlight_priv *priv;
> >>   	struct backlight_properties props = {};
> >> -	struct backlight_device *bdev;
> >>   	u32 source;
> >>   	int ret;
> >>
> >> +	/*
> >> +	 * Check quirks tables to see if this system needs any of the firmware
> >> +	 * bug workarounds.
> >> +	 */
> >> +
> >> +	/* User-set quirks from the module parameters take precedence */
> >> +	if (!backlight_proxy_target)
> >> +		dmi_check_system(proxy_quirk_table);
> >> +
> >> +	dmi_check_system(restore_quirk_table);
> >> +
> >> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> >> +		static int num_reprobe_attempts;
> >> +
> >> +		target = backlight_device_get_by_name(backlight_proxy_target);
> >> +
> >> +		if (!target) {
> >> +			/*
> >> +			 * The target backlight device might not be ready;
> >> +			 * try again and disable backlight proxying if it
> >> +			 * fails too many times.
> >> +			 */
> >> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> >> +				num_reprobe_attempts++;
> >> +				return -EPROBE_DEFER;
> >> +			}
> >> +
> >> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> >> +				backlight_proxy_target, max_reprobe_attempts);
> >> +		}
> >> +	}
> > I think `target` is not put in case of error. You probably need to add something like:
> >
> >    if (target) {
> >      ret = devm_add_action_or_reset(&wdev->dev, put_device_wrapper, target);
> >      if (ret < 0)
> >        return ret;
> >    }
> >
> >
> >> +
> >>   	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
> >>   	                           WMI_BRIGHTNESS_MODE_GET, &source);
> >>   	if (ret)
> >> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
> >>   					      &wdev->dev, wdev,
> >>   					      &nvidia_wmi_ec_backlight_ops,
> >>   					      &props);
> >> -	return PTR_ERR_OR_ZERO(bdev);
> >> +
> >> +	if (IS_ERR(bdev))
> >> +		return PTR_ERR(bdev);
> >> +
> >> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> > `devm_kzalloc()` would probably be better and you should check if `!priv`.
> >
> >
> >> +	priv->bl_dev = bdev;
> >> +
> >> +	dev_set_drvdata(&wdev->dev, priv);
> >> +
> >> +	if (target) {
> >> +		int level = scale_backlight_level(target, bdev);
> >> +
> >> +		if (backlight_device_set_brightness(bdev, level))
> >> +			pr_warn("Unable to import initial brightness level from %s.",
> >> +				backlight_proxy_target);
> >> +		priv->proxy_target = target;
> >> +	}
> >> +
> >> +	if (restore_level_on_resume) {
> >> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> >> +		register_pm_notifier(&priv->nb);
> >> +	}
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> >> +{
> >> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> >> +	struct backlight_device *proxy_target = priv->proxy_target;
> >> +
> >> +	if (proxy_target)
> >> +		put_device(&proxy_target->dev);
> > If you switch to `devm_add_action_or_reset()`, this will not be needed.
> >
> >
> >> +
> >> +	if (priv->nb.notifier_call)
> >> +		unregister_pm_notifier(&priv->nb);
> >> +
> >> +	kfree(priv);
> > If you switch to `devm_kzalloc()`, this won't be needed.
> 
> 
> Thank you, the devm_*() variants are indeed useful.
> 
> 
> >
> >>   }
> >>
> >>   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> >> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
> >>   		.name = "nvidia-wmi-ec-backlight",
> >>   	},
> >>   	.probe = nvidia_wmi_ec_backlight_probe,
> >> +	.remove = nvidia_wmi_ec_backlight_remove,
> >>   	.id_table = nvidia_wmi_ec_backlight_id_table,
> >>   };
> >>   module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> >> --
> >> 2.27.0
> >>
> > Lastly, is it expected that these bugs will be properly fixed?
> 
> 
> Possibly, but I wouldn't hold out hope for it for an issue at this scale
> on an already shipping system.

I assume this question was aimed at narrowing the quirk to match only
certain firmware versions. If there is no certainty about when or whether
it will be fixed, I agree with the current direction.
However, I think it's still worth noting, in a comment near the quirk, the
firmware version on which the problem was identified. If it is later
confirmed that a particular firmware version fixed it, the quirk can be
narrowed or dropped.

> 
> 
> >
> > Regards,
> > Barnabás Pőcze

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16  1:25 [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware Daniel Dadap
  2022-03-16  2:50 ` Barnabás Pőcze
@ 2022-03-16 16:09 ` Hans de Goede
  2022-03-16 17:22   ` Daniel Dadap
  1 sibling, 1 reply; 31+ messages in thread
From: Hans de Goede @ 2022-03-16 16:09 UTC (permalink / raw)
  To: Daniel Dadap, platform-driver-x86; +Cc: Alexandru Dinu

Hi,

On 3/16/22 02:25, Daniel Dadap wrote:
> Some notebook systems with EC-driven backlight control appear to have a
> firmware bug which causes the system to use GPU-driven backlight control
> upon a fresh boot, but then switches to EC-driven backlight control
> after completing a suspend/resume cycle. All the while, the firmware
> reports that the backlight is under EC control, regardless of what is
> actually controlling the backlight brightness.
> 
> This leads to the following behavior:
> 
> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>   WMI-wrapped ACPI method erroneously reporting EC control.
> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>   cycle, due to the backlight control actually being GPU-driven.
> * GPU drivers also register their own backlight handlers: in the case
>   of the notebook system where this behavior has been observed, both
>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>   case observed so far) can successfully control the backlight through
>   its backlight driver's sysfs interface, but stops working after the
>   first suspend/resume cycle.
> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>   fresh boot, but begins to work after the first suspend/resume cycle.
> * The GPU which does not have backlight control (NVIDIA in this case)
>   is not able to control the backlight at any point while the system
>   is in operation. On similar hybrid systems with an EC-controlled
>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>   does not register its backlight handler. It has not been determined
>   whether the non-functional handler registered by the NVIDIA driver
>   is due to another firmware bug, or a bug in the NVIDIA driver.
> 
> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> device, it takes precedence over the BACKLIGHT_RAW devices registered
> by the GPU drivers. This in turn leads to backlight control appearing
> to be non-functional until after completing a suspend/resume cycle.
> However, it is still possible to control the backlight through direct
> interaction with the working GPU driver's backlight sysfs interface.
> 
> These systems also appear to have a second firmware bug which resets
> the EC's brightness level to 100% on resume, but leaves the state in
> the kernel at the pre-suspend level. This causes attempts to save
> and restore the backlight level across the suspend/resume cycle to
> fail, due to the level appearing not to change even though it did.
> 
> In order to work around these issues, add quirk tables to detect
> systems that are known to show these behaviors. So far, there is
> only one known system that requires these workarounds, and both
> issues are present on that system, but the quirks are tracked in
> separate tables to make it easier to add them to other systems which
> may exhibit one of the bugs, but not the other. The original systems
> that this driver was tested on during development do not exhibit
> either of these quirks.
> 
> If a system with the "GPU driver has backlight control" quirk is
> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> (when freshly booted) GPU backlight handler and relay any backlight
> brightness level change requests directed at the EC to also be applied
> to the GPU backlight interface. This leads to redundant updates
> directed at the GPU backlight driver after a suspend/resume cycle, but
> it does allow the EC backlight control to work when the system is
> freshly booted.
> 
> If a system with the "backlight level reset to full on resume" quirk
> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> reset the backlight to the previous level upon resume.
> 
> These workarounds are also plumbed through to kernel module parameters,
> to make it easier for users who suspect they may be affected by one or
> both of these bugs to test whether these workarounds are effective on
> their systems as well.
> 
> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> ---
>  .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
>  1 file changed, 179 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> index 61e37194df70..ccb3b506c12c 100644
> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> @@ -3,8 +3,11 @@
>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>   */
>  
> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME
> +
>  #include <linux/acpi.h>
>  #include <linux/backlight.h>
> +#include <linux/dmi.h>
>  #include <linux/mod_devicetable.h>
>  #include <linux/module.h>
>  #include <linux/types.h>
> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
>  	u32 ignored[3];
>  };
>  
> +/**
> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> + * @bl_dev:       the associated backlight device
> + * @proxy_target: backlight device which receives relayed brightness changes
> + * @notifier:     notifier block for resume callback
> + */
> +struct nvidia_wmi_ec_backlight_priv {
> +	struct backlight_device *bl_dev;
> +	struct backlight_device *proxy_target;
> +	struct notifier_block nb;
> +};
> +
> +static char *backlight_proxy_target;
> +module_param(backlight_proxy_target, charp, 0);
> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> +
> +static int max_reprobe_attempts = 128;
> +module_param(max_reprobe_attempts, int, 0);
> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> +
> +static bool restore_level_on_resume;
> +module_param(restore_level_on_resume, bool, 0);
> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> +
> +static int assign_relay_quirk(const struct dmi_system_id *id)
> +{
> +	backlight_proxy_target = id->driver_data;
> +	return true;
> +}
> +
> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
> +	.callback = assign_relay_quirk,                  \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
> +	},                                               \
> +	.driver_data = quirk_data                        \
> +}
> +
> +static const struct dmi_system_id proxy_quirk_table[] = {
> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
> +	{ }
> +};
> +
> +static int assign_restore_quirk(const struct dmi_system_id *id)
> +{
> +	restore_level_on_resume = true;
> +	return true;
> +}
> +
> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
> +	.callback = assign_restore_quirk,                \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
> +	}                                                \
> +}
> +
> +static const struct dmi_system_id restore_quirk_table[] = {
> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
> +	{ }
> +};
> +
>  /**
>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID

Note not a full review, just something which I noticed on a quick scan,
please use only 1 dmi_system_id table and make driver_data a bit field.

Here is some example code for that copied from another recent review:

So you would get something like this:

#define SERIO_QUIRK_RESET		BIT(0)
#define SERIO_QUIRK_NOMUX		BIT(1)
#define SERIO_QUIRK_NOPNP		BIT(2)
#define SERIO_QUIRK_NOLOOP		BIT(3)
#define SERIO_QUIRK_NOSELFTEST		BIT(4)
// etc.

static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
        {
                /* Entroware Proteus */
                .matches = {
                        DMI_MATCH(DMI_SYS_VENDOR, "Entroware"),
                        DMI_MATCH(DMI_PRODUCT_NAME, "Proteus"),
                        DMI_MATCH(DMI_PRODUCT_VERSION, "EL07R4"),
                },
		.driver_data = (void *)(SERIO_QUIRK_RESET | SERIO_QUIRK_NOMUX)
        },
	{}
};

I picked the Entroware EL07R4 as example here because it needs both the reset and nomux quirks.

And then when checking the quirks do:

#ifdef CONFIG_X86
	const struct dmi_system_id *dmi_id;
	long quirks = 0;

	dmi_id = dmi_first_match(i8042_dmi_quirk_table);
	if (dmi_id)
		quirks = (long)dmi_id->driver_data;

	if (i8042_reset == I8042_RESET_DEFAULT) {
		if (quirks & SERIO_QUIRK_RESET)
			i8042_reset = I8042_RESET_ALWAYS;
		if (quirks & SERIO_QUIRK_NOSELFTEST)
			i8042_reset = I8042_RESET_NEVER;
	}


This will already shrink the driver a bit by not having 2 dmi_system_id structs
for the single laptop model and this will also help to avoid getting even
more dmi_system_id tables if further quirks are necessary in the future,
basically I want to avoid ending up with something like the somewhat messy
code which is being cleaned-up here:

https://lore.kernel.org/linux-input/20220308170523.783284-2-wse@tuxedocomputers.com/

Regards,

Hans

> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>  	return 0;
>  }
>  
> +static int scale_backlight_level(struct backlight_device *a,
> +				 struct backlight_device *b)
> +{
> +	/* because floating point math in the kernel is annoying */
> +	const int scaling_factor = 65536;
> +	int level = a->props.brightness;
> +	int relative_level = level * scaling_factor / a->props.max_brightness;
> +
> +	return relative_level * b->props.max_brightness / scaling_factor;
> +}
> +
>  static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>  {
>  	struct wmi_device *wdev = bl_get_data(bd);
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target) {
> +		int level = scale_backlight_level(bd, proxy_target);
> +
> +		if (backlight_device_set_brightness(proxy_target, level))
> +			pr_warn("Failed to relay backlight update to \"%s\"",
> +				backlight_proxy_target);
> +	}
>  
>  	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>  	                             WMI_BRIGHTNESS_MODE_SET,
> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>  	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>  };
>  
> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> +{
> +
> +	/*
> +	 * On some systems, the EC backlight level gets reset to 100% when
> +	 * resuming from suspend, but the backlight device state still reflects
> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> +	 * state back up with the kernel's.
> +	 */
> +	if (event == PM_POST_SUSPEND) {
> +		struct nvidia_wmi_ec_backlight_priv *p;
> +
> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> +		return backlight_update_status(p->bl_dev);
> +	}
> +
> +	return 0;
> +}
> +
>  static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>  {
> +	struct backlight_device *bdev, *target = NULL;
> +	struct nvidia_wmi_ec_backlight_priv *priv;
>  	struct backlight_properties props = {};
> -	struct backlight_device *bdev;
>  	u32 source;
>  	int ret;
>  
> +	/*
> +	 * Check quirks tables to see if this system needs any of the firmware
> +	 * bug workarounds.
> +	 */
> +
> +	/* User-set quirks from the module parameters take precedence */
> +	if (!backlight_proxy_target)
> +		dmi_check_system(proxy_quirk_table);
> +
> +	dmi_check_system(restore_quirk_table);
> +
> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> +		static int num_reprobe_attempts;
> +
> +		target = backlight_device_get_by_name(backlight_proxy_target);
> +
> +		if (!target) {
> +			/*
> +			 * The target backlight device might not be ready;
> +			 * try again and disable backlight proxying if it
> +			 * fails too many times.
> +			 */
> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> +				num_reprobe_attempts++;
> +				return -EPROBE_DEFER;
> +			}
> +
> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> +				backlight_proxy_target, max_reprobe_attempts);
> +		}
> +	}
> +
>  	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>  	                           WMI_BRIGHTNESS_MODE_GET, &source);
>  	if (ret)
> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>  					      &wdev->dev, wdev,
>  					      &nvidia_wmi_ec_backlight_ops,
>  					      &props);
> -	return PTR_ERR_OR_ZERO(bdev);
> +
> +	if (IS_ERR(bdev))
> +		return PTR_ERR(bdev);
> +
> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> +	priv->bl_dev = bdev;
> +
> +	dev_set_drvdata(&wdev->dev, priv);
> +
> +	if (target) {
> +		int level = scale_backlight_level(target, bdev);
> +
> +		if (backlight_device_set_brightness(bdev, level))
> +			pr_warn("Unable to import initial brightness level from %s.",
> +				backlight_proxy_target);
> +		priv->proxy_target = target;
> +	}
> +
> +	if (restore_level_on_resume) {
> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> +		register_pm_notifier(&priv->nb);
> +	}
> +
> +	return 0;
> +}
> +
> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> +{
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target)
> +		put_device(&proxy_target->dev);
> +
> +	if (priv->nb.notifier_call)
> +		unregister_pm_notifier(&priv->nb);
> +
> +	kfree(priv);
>  }
>  
>  #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>  		.name = "nvidia-wmi-ec-backlight",
>  	},
>  	.probe = nvidia_wmi_ec_backlight_probe,
> +	.remove = nvidia_wmi_ec_backlight_remove,
>  	.id_table = nvidia_wmi_ec_backlight_id_table,
>  };
>  module_wmi_driver(nvidia_wmi_ec_backlight_driver);



* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 15:29     ` Limonciello, Mario
@ 2022-03-16 17:08       ` Daniel Dadap
  2022-03-16 17:21         ` Limonciello, Mario
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 17:08 UTC (permalink / raw)
  To: Limonciello, Mario, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

On 3/16/22 10:29 AM, Limonciello, Mario wrote:
> [Public]
>
> + Alex D
>
> Alex, just FYI this was something that came to an AMD bug tracker and wanted you to be aware there are W/A going into nvidia-wmi-ec-backlight for some firmware problems with the mux.
> IIRC that was the original suspicion too on the bug reports.


Is this on a public or private bug tracker? If this was observed on 
systems other than the one already added to these quirks, could you 
share the details of the systems so they can be added as well? (Or I 
suppose you may want to test to see if these WARs are effective on the 
affected systems as well; we can always expand the quirks table later.)


> Comments inline as well.
>
>> -----Original Message-----
>> From: Daniel Dadap <ddadap@nvidia.com>
>> Sent: Wednesday, March 16, 2022 10:11
>> To: Barnabás Pőcze <pobrn@protonmail.com>
>> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
>> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
>> markgross@kernel.org
>> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
>> confused firmware
>>

[ ... ]


>>
>> On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
>>>   [ ... ]
>>> Lastly, is it expected that these bugs will be properly fixed?
>>
>> Possibly, but I wouldn't hold out hope for it for an issue at this scale
>> on an already shipping system.
> I'm assuming this question was aimed at narrowing the quirk to only
> match certain FW versions or so.  If there is no certainty of when/if it
> will be fixed, I agree with the current direction.
> However, I think it's still worth at least noting, in a comment near the
> quirk, the firmware version on which it was identified.  If there is later
> confirmation that a particular firmware version fixed it, the quirk can be
> adjusted or dropped.
>

Thanks, Mario. Sure, I'll make sure the firmware version this was first 
observed in is noted.





* RE: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 17:08       ` Daniel Dadap
@ 2022-03-16 17:21         ` Limonciello, Mario
  2022-03-16 17:37           ` Daniel Dadap
  0 siblings, 1 reply; 31+ messages in thread
From: Limonciello, Mario @ 2022-03-16 17:21 UTC (permalink / raw)
  To: Daniel Dadap, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

[Public]

> On 3/16/22 10:29 AM, Limonciello, Mario wrote:
> > [Public]
> >
> > + Alex D
> >
> > Alex, just FYI this was something that came to an AMD bug tracker and
> wanted you to be aware there are W/A going into nvidia-wmi-ec-backlight
> for some firmware problems with the mux.
> > IIRC that was the original suspicion too on the bug reports.
> 
> 
> Is this on a public or private bug tracker? If this was observed on
> systems other than the one already added to these quirks, could you
> share the details of the systems so they can be added as well? (Or I
> suppose you may want to test to see if these WARs are effective on the
> affected systems as well; we can always expand the quirks table later.)

We (AMD folks) don't have the affected systems, we were just trying to help
users and things pointed at this driver, which seems to have yielded a good
investigation and conclusion!

IIRC this is the bug you want linked in the commit message:
https://gitlab.freedesktop.org/drm/amd/-/issues/1671

But these two look like they may share the same root cause:
https://gitlab.freedesktop.org/drm/amd/-/issues/1791
https://gitlab.freedesktop.org/drm/amd/-/issues/1794

If you end up introducing a module parameter to try to activate these quirks
it might be viable to ask the folks in those issues to try the v2 of your patch too
when you're ready with the module parameter.

> 
> 
> > Comments inline as well.
> >
> >> -----Original Message-----
> >> From: Daniel Dadap <ddadap@nvidia.com>
> >> Sent: Wednesday, March 16, 2022 10:11
> >> To: Barnabás Pőcze <pobrn@protonmail.com>
> >> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
> >> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
> >> markgross@kernel.org
> >> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
> >> confused firmware
> >>
> 
> [ ... ]
> 
> 
> >>
> >> On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
> >>>   [ ... ]
> >>> Lastly, is it expected that these bugs will be properly fixed?
> >>
> >> Possibly, but I wouldn't hold out hope for it for an issue at this scale
> >> on an already shipping system.
> > I'm assuming this question was aimed at narrowing the quirk to only
> > match certain FW versions or so.  If there is no certainty of when/if it
> > will be fixed, I agree with the current direction.
> > However, I think it's still worth at least noting, in a comment near the
> > quirk, the firmware version on which it was identified.  If there is later
> > confirmation that a particular firmware version fixed it, the quirk can be
> > adjusted or dropped.
> >
> 
> Thanks, Mario. Sure, I'll make sure the firmware version this was first
> observed in is noted.
> 
> 


* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 16:09 ` [PATCH] " Hans de Goede
@ 2022-03-16 17:22   ` Daniel Dadap
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 17:22 UTC (permalink / raw)
  To: Hans de Goede, platform-driver-x86; +Cc: Alexandru Dinu


On 3/16/22 11:09 AM, Hans de Goede wrote:
> Hi,
>
> On 3/16/22 02:25, Daniel Dadap wrote:
>> Some notebook systems with EC-driven backlight control appear to have a
>> firmware bug which causes the system to use GPU-driven backlight control
>> upon a fresh boot, but then switches to EC-driven backlight control
>> after completing a suspend/resume cycle. All the while, the firmware
>> reports that the backlight is under EC control, regardless of what is
>> actually controlling the backlight brightness.
>>
>> This leads to the following behavior:
>>
>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>    WMI-wrapped ACPI method erroneously reporting EC control.
>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>    cycle, due to the backlight control actually being GPU-driven.
>> * GPU drivers also register their own backlight handlers: in the case
>>    of the notebook system where this behavior has been observed, both
>>    amdgpu and the NVIDIA proprietary driver register backlight handlers.
>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>    case observed so far) can successfully control the backlight through
>>    its backlight driver's sysfs interface, but stops working after the
>>    first suspend/resume cycle.
>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>    fresh boot, but begins to work after the first suspend/resume cycle.
>> * The GPU which does not have backlight control (NVIDIA in this case)
>>    is not able to control the backlight at any point while the system
>>    is in operation. On similar hybrid systems with an EC-controlled
>>    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>    does not register its backlight handler. It has not been determined
>>    whether the non-functional handler registered by the NVIDIA driver
>>    is due to another firmware bug, or a bug in the NVIDIA driver.
>>
>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>> by the GPU drivers. This in turn leads to backlight control appearing
>> to be non-functional until after completing a suspend/resume cycle.
>> However, it is still possible to control the backlight through direct
>> interaction with the working GPU driver's backlight sysfs interface.
>>
>> These systems also appear to have a second firmware bug which resets
>> the EC's brightness level to 100% on resume, but leaves the state in
>> the kernel at the pre-suspend level. This causes attempts to save
>> and restore the backlight level across the suspend/resume cycle to
>> fail, due to the level appearing not to change even though it did.
>>
>> In order to work around these issues, add quirk tables to detect
>> systems that are known to show these behaviors. So far, there is
>> only one known system that requires these workarounds, and both
>> issues are present on that system, but the quirks are tracked in
>> separate tables to make it easier to add them to other systems which
>> may exhibit one of the bugs, but not the other. The original systems
>> that this driver was tested on during development do not exhibit
>> either of these quirks.
>>
>> If a system with the "GPU driver has backlight control" quirk is
>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>> (when freshly booted) GPU backlight handler and relay any backlight
>> brightness level change requests directed at the EC to also be applied
>> to the GPU backlight interface. This leads to redundant updates
>> directed at the GPU backlight driver after a suspend/resume cycle, but
>> it does allow the EC backlight control to work when the system is
>> freshly booted.
>>
>> If a system with the "backlight level reset to full on resume" quirk
>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>> reset the backlight to the previous level upon resume.
>>
>> These workarounds are also plumbed through to kernel module parameters,
>> to make it easier for users who suspect they may be affected by one or
>> both of these bugs to test whether these workarounds are effective on
>> their systems as well.
>>
>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>> ---
>>   .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
>>   1 file changed, 179 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> index 61e37194df70..ccb3b506c12c 100644
>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> @@ -3,8 +3,11 @@
>>    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>    */
>>   
>> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME
>> +
>>   #include <linux/acpi.h>
>>   #include <linux/backlight.h>
>> +#include <linux/dmi.h>
>>   #include <linux/mod_devicetable.h>
>>   #include <linux/module.h>
>>   #include <linux/types.h>
>> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
>>   	u32 ignored[3];
>>   };
>>   
>> +/**
>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>> + * @bl_dev:       the associated backlight device
>> + * @proxy_target: backlight device which receives relayed brightness changes
>> + * @notifier:     notifier block for resume callback
>> + */
>> +struct nvidia_wmi_ec_backlight_priv {
>> +	struct backlight_device *bl_dev;
>> +	struct backlight_device *proxy_target;
>> +	struct notifier_block nb;
>> +};
>> +
>> +static char *backlight_proxy_target;
>> +module_param(backlight_proxy_target, charp, 0);
>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>> +
>> +static int max_reprobe_attempts = 128;
>> +module_param(max_reprobe_attempts, int, 0);
>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>> +
>> +static bool restore_level_on_resume;
>> +module_param(restore_level_on_resume, bool, 0);
>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>> +
>> +static int assign_relay_quirk(const struct dmi_system_id *id)
>> +{
>> +	backlight_proxy_target = id->driver_data;
>> +	return true;
>> +}
>> +
>> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
>> +	.callback = assign_relay_quirk,                  \
>> +	.matches = {                                     \
>> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
>> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
>> +	},                                               \
>> +	.driver_data = quirk_data                        \
>> +}
>> +
>> +static const struct dmi_system_id proxy_quirk_table[] = {
>> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
>> +	{ }
>> +};
>> +
>> +static int assign_restore_quirk(const struct dmi_system_id *id)
>> +{
>> +	restore_level_on_resume = true;
>> +	return true;
>> +}
>> +
>> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
>> +	.callback = assign_restore_quirk,                \
>> +	.matches = {                                     \
>> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
>> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
>> +	}                                                \
>> +}
>> +
>> +static const struct dmi_system_id restore_quirk_table[] = {
>> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
>> +	{ }
>> +};
>> +
>>   /**
>>    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> Note not a full review, just something which I noticed on a quick scan,
> please use only 1 dmi_system_id table and make driver_data a bit field.
>
> Here is some example code for that copied from another recent review:
>
> So you would get something like this:
>
> #define SERIO_QUIRK_RESET		BIT(0)
> #define SERIO_QUIRK_NOMUX		BIT(1)
> #define SERIO_QUIRK_NOPNP		BIT(2)
> #define SERIO_QUIRK_NOLOOP		BIT(3)
> #define SERIO_QUIRK_NOSELFTEST		BIT(4)
> // etc.
>
> static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
>          {
>                  /* Entroware Proteus */
>                  .matches = {
>                          DMI_MATCH(DMI_SYS_VENDOR, "Entroware"),
>                          DMI_MATCH(DMI_PRODUCT_NAME, "Proteus"),
>                          DMI_MATCH(DMI_PRODUCT_VERSION, "EL07R4"),
>                  },
> 		.driver_data = (void *)(SERIO_QUIRK_RESET | SERIO_QUIRK_NOMUX)
>          },
> 	{}
> };


Thanks, yes: merging the tables would be pretty straightforward. I 
actually thought I might do a unified quirks table when we noticed the 
second quirk, but then thought it was kind of gross to cast a bit field 
to a pointer and then back. I didn't think to check for prior art 
to see that in fact, this is exactly what other drivers do. I also was 
slightly worried about running out of bits if there are enough unique 
GPU backlight device names among other affected systems, but the 
likelihood of that happening seems remote enough that it isn't really 
worth considering.


> I picked the Entroware EL07R4 as example here because it needs both the reset and nomux quirks.
>
> And then when checking the quirks do:
>
> #ifdef CONFIG_X86
> 	const struct dmi_system_id *dmi_id;
> 	long quirks = 0;
>
> 	dmi_id = dmi_first_match(i8042_dmi_quirk_table);
> 	if (dmi_id)
> 		quirks = (long)dmi_id->driver_data;
>
> 	if (i8042_reset == I8042_RESET_DEFAULT) {
> 		if (quirks & SERIO_QUIRK_RESET)
> 			i8042_reset = I8042_RESET_ALWAYS;
> 		if (quirks & SERIO_QUIRK_NOSELFTEST)
> 			i8042_reset = I8042_RESET_NEVER;
> 	}
>
>
> This will already shrink the driver a bit by not having 2 dmi_system_id structs
> for the single laptop model and this will also help to avoid getting even
> more dmi_system_id tables if further quirks are necessary in the future,
> basically I want to avoid ending up with something like the somewhat messy
> code which is being cleaned-up here:
>
> https://lore.kernel.org/linux-input/20220308170523.783284-2-wse@tuxedocomputers.com/
>
> Regards,
>
> Hans
>
>
>
>
>
>
>
>> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>   	return 0;
>>   }
>>   
>> +static int scale_backlight_level(struct backlight_device *a,
>> +				 struct backlight_device *b)
>> +{
>> +	/* because floating point math in the kernel is annoying */
>> +	const int scaling_factor = 65536;
>> +	int level = a->props.brightness;
>> +	int relative_level = level * scaling_factor / a->props.max_brightness;
>> +
>> +	return relative_level * b->props.max_brightness / scaling_factor;
>> +}
>> +
>>   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>   {
>>   	struct wmi_device *wdev = bl_get_data(bd);
>> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +	struct backlight_device *proxy_target = priv->proxy_target;
>> +
>> +	if (proxy_target) {
>> +		int level = scale_backlight_level(bd, proxy_target);
>> +
>> +		if (backlight_device_set_brightness(proxy_target, level))
>> +			pr_warn("Failed to relay backlight update to \"%s\"",
>> +				backlight_proxy_target);
>> +	}
>>   
>>   	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>   	                             WMI_BRIGHTNESS_MODE_SET,
>> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>   	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>   };
>>   
>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>> +{
>> +
>> +	/*
>> +	 * On some systems, the EC backlight level gets reset to 100% when
>> +	 * resuming from suspend, but the backlight device state still reflects
>> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
>> +	 * state back up with the kernel's.
>> +	 */
>> +	if (event == PM_POST_SUSPEND) {
>> +		struct nvidia_wmi_ec_backlight_priv *p;
>> +
>> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>> +		return backlight_update_status(p->bl_dev);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>>   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>   {
>> +	struct backlight_device *bdev, *target = NULL;
>> +	struct nvidia_wmi_ec_backlight_priv *priv;
>>   	struct backlight_properties props = {};
>> -	struct backlight_device *bdev;
>>   	u32 source;
>>   	int ret;
>>   
>> +	/*
>> +	 * Check quirks tables to see if this system needs any of the firmware
>> +	 * bug workarounds.
>> +	 */
>> +
>> +	/* User-set quirks from the module parameters take precedence */
>> +	if (!backlight_proxy_target)
>> +		dmi_check_system(proxy_quirk_table);
>> +
>> +	dmi_check_system(restore_quirk_table);
>> +
>> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
>> +		static int num_reprobe_attempts;
>> +
>> +		target = backlight_device_get_by_name(backlight_proxy_target);
>> +
>> +		if (!target) {
>> +			/*
>> +			 * The target backlight device might not be ready;
>> +			 * try again and disable backlight proxying if it
>> +			 * fails too many times.
>> +			 */
>> +			if (num_reprobe_attempts < max_reprobe_attempts) {
>> +				num_reprobe_attempts++;
>> +				return -EPROBE_DEFER;
>> +			}
>> +
>> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>> +				backlight_proxy_target, max_reprobe_attempts);
>> +		}
>> +	}
>> +
>>   	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>   	                           WMI_BRIGHTNESS_MODE_GET, &source);
>>   	if (ret)
>> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>   					      &wdev->dev, wdev,
>>   					      &nvidia_wmi_ec_backlight_ops,
>>   					      &props);
>> -	return PTR_ERR_OR_ZERO(bdev);
>> +
>> +	if (IS_ERR(bdev))
>> +		return PTR_ERR(bdev);
>> +
>> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> +	priv->bl_dev = bdev;
>> +
>> +	dev_set_drvdata(&wdev->dev, priv);
>> +
>> +	if (target) {
>> +		int level = scale_backlight_level(target, bdev);
>> +
>> +		if (backlight_device_set_brightness(bdev, level))
>> +			pr_warn("Unable to import initial brightness level from %s.",
>> +				backlight_proxy_target);
>> +		priv->proxy_target = target;
>> +	}
>> +
>> +	if (restore_level_on_resume) {
>> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>> +		register_pm_notifier(&priv->nb);
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>> +{
>> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +	struct backlight_device *proxy_target = priv->proxy_target;
>> +
>> +	if (proxy_target)
>> +		put_device(&proxy_target->dev);
>> +
>> +	if (priv->nb.notifier_call)
>> +		unregister_pm_notifier(&priv->nb);
>> +
>> +	kfree(priv);
>>   }
>>   
>>   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>   		.name = "nvidia-wmi-ec-backlight",
>>   	},
>>   	.probe = nvidia_wmi_ec_backlight_probe,
>> +	.remove = nvidia_wmi_ec_backlight_remove,
>>   	.id_table = nvidia_wmi_ec_backlight_id_table,
>>   };
>>   module_wmi_driver(nvidia_wmi_ec_backlight_driver);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 17:21         ` Limonciello, Mario
@ 2022-03-16 17:37           ` Daniel Dadap
  2022-03-16 18:25             ` Limonciello, Mario
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 17:37 UTC (permalink / raw)
  To: Limonciello, Mario, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander


On 3/16/22 12:21 PM, Limonciello, Mario wrote:
> [Public]
>
>> On 3/16/22 10:29 AM, Limonciello, Mario wrote:
>>> [Public]
>>>
>>> + Alex D
>>>
>>> Alex, just FYI this was something that came to an AMD bug tracker and
>> wanted you to be aware there are W/A going into nvidia-wmi-ec-backlight
>> for some firmware problems with the mux.
>>> IIRC that was the original suspicion too on the bug reports.
>>
>> Is this on a public or private bug tracker? If this was observed on
>> systems other than the one already added to these quirks, could you
>> share the details of the systems so they can be added as well? (Or I
>> suppose you may want to test to see if these WARs are effective on the
>> affected systems as well; we can always expand the quirks table later.)
> We (AMD folks) don't have the affected systems, we were just trying to help
> users and things pointed at this driver, which seems to have yielded a good
> investigation and conclusion!
>
> IIRC this is the bug you want linked in the commit message:
> https://gitlab.freedesktop.org/drm/amd/-/issues/1671


Ah, thanks. Most of the people on this bug seem like their problem was 
that they didn't have the nvidia-wmi-ec-backlight driver, which also 
didn't exist at the time the bug was filed. There is one person with a 
newer comment reporting behavior that sounds like what this patch works 
around, and it is the same person who initially reported the issue to me. :)


> But these two look possible to be the same root cause:
> https://gitlab.freedesktop.org/drm/amd/-/issues/1791


This one sounds like it might be a different issue, since it was 
apparently working at some point with a kernel that didn't have the EC 
backlight driver, and then not working on a newer kernel that also 
didn't have the EC backlight driver. That is, of course, assuming 
vanilla kernels: it is certainly possible that the EC backlight driver 
was backported.

> https://gitlab.freedesktop.org/drm/amd/-/issues/1794


This sounds like it could possibly be a simple case of not having the EC 
backlight driver. Notably, the backlight device exposed by the amdgpu 
driver never works, in contrast to the system these workarounds are 
targeting, where the amdgpu driver's backlight device initially works, 
but then stops working after the first suspend/resume cycle (and the EC 
backlight driver doesn't work initially, but then starts working after 
suspend/resume).


>
> If you end up introducing a module parameter to try to activate these quirks
> it might be viable to ask the folks in those issues to try the v2 of your patch too
> when you're ready with the module parameter.
>

v1 already has the quirks plumbed up to module parameters (those module 
parameters just don't have corresponding sysfs entries). In any case, I 
only see one report between those bugs that sounds like the issue these 
WARs are meant to address, and since it's from the same reporter, it 
sounds like we won't need to be adding any additional quirks table 
entries right away.


>>
>>> Comments inline as well.
>>>
>>>> -----Original Message-----
>>>> From: Daniel Dadap <ddadap@nvidia.com>
>>>> Sent: Wednesday, March 16, 2022 10:11
>>>> To: Barnabás Pőcze <pobrn@protonmail.com>
>>>> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
>>>> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
>>>> markgross@kernel.org
>>>> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
>>>> confused firmware
>>>>
>> [ ... ]
>>
>>
>>>> On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
>>>>>    [ ... ]
>>>>> Lastly, is it expected that these bugs will be properly fixed?
>>>> Possibly, but I wouldn't hold out hope for it for an issue at this scale
>>>> on an already shipping system.
>>> This question I'm assuming was aimed at narrowing the quirk to only
>>> match certain FW versions or so.  If there is no certainty of when/if it
>>> will be fixed I agree with current direction.
>>> However I think it's still worth at least noting near the quirk in a comment
>>> what firmware version it was identified.  If later there is confirmation that
>>> a particular firmware version had fixed it the quirk can be adjusted to be
>>> dropped.
>>>
>> Thanks, Mario. Sure, I'll make sure the firmware version this was first
>> observed in is noted.
>>
>>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 17:37           ` Daniel Dadap
@ 2022-03-16 18:25             ` Limonciello, Mario
  2022-03-16 19:23               ` Daniel Dadap
  0 siblings, 1 reply; 31+ messages in thread
From: Limonciello, Mario @ 2022-03-16 18:25 UTC (permalink / raw)
  To: Daniel Dadap, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

[Public]

> >
> > IIRC this is the bug you want linked in the commit message:
> > https://gitlab.freedesktop.org/drm/amd/-/issues/1671
> 
> 
> Ah, thanks. Most of the people on this bug seem like their problem was
> that they didn't have the nvidia-wmi-ec-backlight driver, which also
> didn't exist at the time the bug was filed. There is one person with a
> newer comment reporting behavior that sounds like what this patch works
> around, and it is the same person who initially reported the issue to me. :)
> 
> 

Thanks for looking at those.

> > But these two look possible to be the same root cause:
> > https://gitlab.freedesktop.org/drm/amd/-/issues/1791
> 
> 
> This one sounds like it might be a different issue, since it was
> apparently working at some point with a kernel that didn't have the EC
> backlight driver, and then not working on a newer kernel that also
> didn't have the EC backlight driver. That is, of course, assuming
> vanilla kernels: it is certainly possible that the EC backlight driver
> was backported.
> 
> > https://gitlab.freedesktop.org/drm/amd/-/issues/1794
> 
> 
> This sounds like it could possibly be a simple case of not having the EC
> backlight driver. Notably, the backlight device exposed by the amdgpu
> driver never works, in contrast to the system these workarounds are
> targeting, where the amdgpu driver's backlight device initially works,
> but then stops working after the first suspend/resume cycle (and the EC
> backlight driver doesn't work initially, but then starts working after
> suspend/resume).

I guess when we see backlight issues on these A+N designs the checks should be:
1) Are they supposed to be using the nvidia-wmi-ec-backlight driver?
2) Is their kernel new enough to have it?
3) Do they have the config enabled?

Do you have a script or could you perhaps include some documentation we can
point people to check "1" so we don't always have to go tear apart ACPI tables
and make guesses?

I guess it's something like grab _WDG and then parse it to see if there is an entry.

> 
> 
> >
> > If you end up introducing a module parameter to try to activate these quirks
> > it might be viable to ask the folks in those issues to try the v2 of your patch too
> > when you're ready with the module parameter.
> >
> 
> v1 already has the quirks plumbed up to module parameters (those module
> parameters just don't have corresponding sysfs entries). In any case, I
> only see one report between those bugs that sounds like the issue these
> WARs are meant to address, and since it's from the same reporter, it
> sounds like we won't need to be adding any additional quirks table
> entries right away.
> 
> 
> >>
> >>> Comments inline as well.
> >>>
> >>>> -----Original Message-----
> >>>> From: Daniel Dadap <ddadap@nvidia.com>
> >>>> Sent: Wednesday, March 16, 2022 10:11
> >>>> To: Barnabás Pőcze <pobrn@protonmail.com>
> >>>> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
> >>>> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
> >>>> markgross@kernel.org
> >>>> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
> >>>> confused firmware
> >>>>
> >> [ ... ]
> >>
> >>
> >>>> On 3/15/22 9:50 PM, Barnabás Pőcze wrote:
> >>>>>    [ ... ]
> >>>>> Lastly, is it expected that these bugs will be properly fixed?
> >>>> Possibly, but I wouldn't hold out hope for it for an issue at this scale
> >>>> on an already shipping system.
> >>> This question I'm assuming was aimed at narrowing the quirk to only
> >>> match certain FW versions or so.  If there is no certainty of when/if it
> >>> will be fixed I agree with current direction.
> >>> However I think it's still worth at least noting near the quirk in a comment
> >>> what firmware version it was identified.  If later there is confirmation that
> >>> a particular firmware version had fixed it the quirk can be adjusted to be
> >>> dropped.
> >>>
> >> Thanks, Mario. Sure, I'll make sure the firmware version this was first
> >> observed in is noted.
> >>
> >>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 18:25             ` Limonciello, Mario
@ 2022-03-16 19:23               ` Daniel Dadap
  2022-03-16 19:25                 ` Limonciello, Mario
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 19:23 UTC (permalink / raw)
  To: Limonciello, Mario, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

On 3/16/22 13:25, Limonciello, Mario wrote:
> [Public]
>
>
> I guess when we see backlight issues on these A+N designs the checks should be:
> 1) Are they supposed to be using the nvidia-wmi-ec-backlight driver?
> 2) Is their kernel new enough to have it?
> 3) Do they have the config enabled?
>
> Do you have a script or could you perhaps include some documentation we can
> point people to check "1" so we don't always have to go tear apart ACPI tables
> and make guesses?
>
> I guess it's something like grab _WDG and then parse it to see if there is an entry.


Probably the most foolproof way would be to check for the GUID 
603E9613-EF25-4338-A3D0-C46177516DB7 in /sys/bus/wmi/devices. (2) should 
be true for vanilla 5.16 and later, and many recent pre-5.16 distro 
kernels with HWE backports.
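
A minimal userspace sketch of that check, in C, for anyone who wants to
script it. This is only an illustration: the helper name
wmi_guid_present() is made up here, and the case-insensitive prefix
match is an assumption meant to tolerate entries that carry a suffix
after the GUID.

```c
#include <dirent.h>
#include <string.h>
#include <strings.h>

#define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"

/*
 * Hypothetical helper: scan dir_path for an entry whose name starts
 * with guid (compared case-insensitively).
 *
 * Returns 1 if such an entry exists, 0 if none does, and -1 if the
 * directory cannot be opened.
 */
static int wmi_guid_present(const char *dir_path, const char *guid)
{
	DIR *dir = opendir(dir_path);
	struct dirent *entry;
	size_t guid_len = strlen(guid);
	int found = 0;

	if (!dir)
		return -1;

	while ((entry = readdir(dir)) != NULL) {
		if (strncasecmp(entry->d_name, guid, guid_len) == 0) {
			found = 1;
			break;
		}
	}

	closedir(dir);
	return found;
}
```

Calling wmi_guid_present("/sys/bus/wmi/devices", WMI_BRIGHTNESS_GUID)
should return 1 when the firmware exposes the EC backlight interface,
0 when the GUID is absent, and -1 when the WMI bus directory itself is
missing.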


^ permalink raw reply	[flat|nested] 31+ messages in thread

* RE: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 19:23               ` Daniel Dadap
@ 2022-03-16 19:25                 ` Limonciello, Mario
  0 siblings, 0 replies; 31+ messages in thread
From: Limonciello, Mario @ 2022-03-16 19:25 UTC (permalink / raw)
  To: Daniel Dadap, Barnabás Pőcze
  Cc: platform-driver-x86, Alexandru Dinu, Hans de Goede, markgross,
	Deucher, Alexander

[Public]



> -----Original Message-----
> From: Daniel Dadap <ddadap@nvidia.com>
> Sent: Wednesday, March 16, 2022 14:24
> To: Limonciello, Mario <Mario.Limonciello@amd.com>; Barnabás Pőcze
> <pobrn@protonmail.com>
> Cc: platform-driver-x86@vger.kernel.org; Alexandru Dinu
> <alex.dinu07@gmail.com>; Hans de Goede <hdegoede@redhat.com>;
> markgross@kernel.org; Deucher, Alexander
> <Alexander.Deucher@amd.com>
> Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for
> confused firmware
> 
> On 3/16/22 13:25, Limonciello, Mario wrote:
> > [Public]
> >
> >
> > I guess when we see backlight issues on these A+N designs the checks should be:
> > 1) Are they supposed to be using the nvidia-wmi-ec-backlight driver?
> > 2) Is their kernel new enough to have it?
> > 3) Do they have the config enabled?
> >
> > Do you have a script or could you perhaps include some documentation we can
> > point people to check "1" so we don't always have to go tear apart ACPI tables
> > and make guesses?
> >
> > I guess it's something like grab _WDG and then parse it to see if there is an entry.
> 
> 
> Probably the most foolproof way would be to check for the GUID
> 603E9613-EF25-4338-A3D0-C46177516DB7 in /sys/bus/wmi/devices. (2) should
> be true for vanilla 5.16 and later, and many recent pre-5.16 distro
> kernels with HWE backports.

Perfect, thanks!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 15:11   ` Daniel Dadap
  2022-03-16 15:29     ` Limonciello, Mario
@ 2022-03-16 20:33     ` Daniel Dadap
  2022-03-16 21:28       ` Daniel Dadap
  2022-03-17 12:17       ` Hans de Goede
  1 sibling, 2 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 20:33 UTC (permalink / raw)
  To: platform-driver-x86
  Cc: pobrn, hdegoede, markgross, Mario.Limonciello, Daniel Dadap,
	Alexandru Dinu

Some notebook systems with EC-driven backlight control appear to have a
firmware bug which causes the system to use GPU-driven backlight control
upon a fresh boot, but then switch to EC-driven backlight control
after completing a suspend/resume cycle. All the while, the firmware
reports that the backlight is under EC control, regardless of what is
actually controlling the backlight brightness.

This leads to the following behavior:

* nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
  WMI-wrapped ACPI method erroneously reporting EC control.
* nvidia-wmi-ec-backlight does not work until after a suspend/resume
  cycle, due to the backlight control actually being GPU-driven.
* GPU drivers also register their own backlight handlers: in the case
  of the notebook system where this behavior has been observed, both
  amdgpu and the NVIDIA proprietary driver register backlight handlers.
* The GPU which has backlight control upon a fresh boot (amdgpu in the
  case observed so far) can successfully control the backlight through
  its backlight driver's sysfs interface, but stops working after the
  first suspend/resume cycle.
* nvidia-wmi-ec-backlight is unable to control the backlight upon a
  fresh boot, but begins to work after the first suspend/resume cycle.
* The GPU which does not have backlight control (NVIDIA in this case)
  is not able to control the backlight at any point while the system
  is in operation. On similar hybrid systems with an EC-controlled
  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
  does not register its backlight handler. It has not been determined
  whether the non-functional handler registered by the NVIDIA driver
  is due to another firmware bug, or a bug in the NVIDIA driver.

Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
device, it takes precedence over the BACKLIGHT_RAW devices registered
by the GPU drivers. This in turn leads to backlight control appearing
to be non-functional until after completing a suspend/resume cycle.
However, it is still possible to control the backlight through direct
interaction with the working GPU driver's backlight sysfs interface.

These systems also appear to have a second firmware bug which resets
the EC's brightness level to 100% on resume, but leaves the state in
the kernel at the pre-suspend level. This causes attempts to save
and restore the backlight level across the suspend/resume cycle to
fail, due to the level appearing not to change even though it did.

In order to work around these issues, add a quirk table to detect
systems that are known to show these behaviors. So far, there is
only one known system that requires these workarounds, and both
issues are present on that system, but the quirks are tracked
separately to make it easier to add them to other systems which
may exhibit one of the bugs, but not the other. The original systems
that this driver was tested on during development do not exhibit
either of these quirks.

If a system with the "GPU driver has backlight control" quirk is
detected, nvidia-wmi-ec-backlight will grab a reference to the working
(when freshly booted) GPU backlight handler and relay any backlight
brightness level change requests directed at the EC to the GPU
backlight interface as well. This leads to redundant updates
directed at the GPU backlight driver after a suspend/resume cycle, but
it does allow the EC backlight control to work when the system is
freshly booted.

If a system with the "backlight level reset to full on resume" quirk
is detected, nvidia-wmi-ec-backlight will register a PM notifier to
reset the backlight to the previous level upon resume.

These workarounds are also plumbed through to kernel module parameters,
to make it easier for users who suspect they may be affected by one or
both of these bugs to test whether these workarounds are effective on
their systems as well.

Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
---
Note: the Tested-by: line above applies to the previous version of this
patch; an explicit ACK from the tester is required for it to apply to
the current version.

v2:
 * Add readable sysfs files for module params, use linear interpolation
   from fixp-arith.h, fix return value of notifier callback, use devm_*()
   for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
 * Add comment to denote known firmware versions that exhibit the bugs.
   (Mario Limonciello <Mario.Limonciello@amd.com>)
 * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
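
For illustration only (not part of the patch): a userspace sketch of the
brightness scaling that the linear-interpolation change is meant to
perform. linear_interpolate() is a stand-in written for this note and is
assumed, not guaranteed, to behave like the kernel helper;
scale_level() is likewise a hypothetical name.

```c
/*
 * Stand-in for a fixed-point linear interpolation helper: evaluate at x
 * the line through (x0, y0) and (x1, y1), using integer arithmetic.
 * The product (y1 - y0) * (x - x0) must fit in an int.
 */
static int linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	if (y0 == y1 || x == x0)
		return y0;
	if (x1 == x0 || x == x1)
		return y1;

	return y0 + ((y1 - y0) * (x - x0) / (x1 - x0));
}

/*
 * Map from_level in the range [0, from_max] onto [0, to_max], which is
 * how a level on one backlight device would be translated to the range
 * of another.
 */
static int scale_level(int from_level, int from_max, int to_max)
{
	return linear_interpolate(0, 0, from_max, to_max, from_level);
}
```

With these definitions, a level of 128 on a 0-255 device maps to 50 on
a 0-100 device, and the end points (0 and the maximum) map exactly
onto each other; intermediate values are truncated toward zero by the
integer division.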

 .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
 1 file changed, 194 insertions(+), 2 deletions(-)

diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
index 61e37194df70..95e1ddf780fc 100644
--- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
+++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
@@ -3,8 +3,12 @@
  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
  */
 
+#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
+
 #include <linux/acpi.h>
 #include <linux/backlight.h>
+#include <linux/dmi.h>
+#include <linux/fixp-arith.h>
 #include <linux/mod_devicetable.h>
 #include <linux/module.h>
 #include <linux/types.h>
@@ -75,6 +79,73 @@ struct wmi_brightness_args {
 	u32 ignored[3];
 };
 
+/**
+ * struct nvidia_wmi_ec_backlight_priv - driver private data
+ * @bl_dev:       the associated backlight device
+ * @proxy_target: backlight device which receives relayed brightness changes
+ * @nb:           notifier block for resume callback
+ */
+struct nvidia_wmi_ec_backlight_priv {
+	struct backlight_device *bl_dev;
+	struct backlight_device *proxy_target;
+	struct notifier_block nb;
+};
+
+static char *backlight_proxy_target;
+module_param(backlight_proxy_target, charp, 0444);
+MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
+
+static int max_reprobe_attempts = 128;
+module_param(max_reprobe_attempts, int, 0444);
+MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
+
+static bool restore_level_on_resume;
+module_param(restore_level_on_resume, bool, 0444);
+MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
+
+/* Bit field values for quirks table */
+
+#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
+
+/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
+
+#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
+
+#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
+#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
+
+static int assign_quirks(const struct dmi_system_id *id)
+{
+	if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
+		restore_level_on_resume = 1;
+
+	/* If the module parameter is set, override the quirks table */
+	if (!backlight_proxy_target) {
+		if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
+			backlight_proxy_target = "amdgpu_bl1";
+	}
+
+	return true;
+}
+
+#define QUIRK_ENTRY(vendor, product, quirks) {          \
+	.callback = assign_quirks,                      \
+	.matches = {                                    \
+		DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
+		DMI_MATCH(DMI_PRODUCT_VERSION, product) \
+	},                                              \
+	.driver_data = (void *)(quirks)                 \
+}
+
+static const struct dmi_system_id quirks_table[] = {
+	QUIRK_ENTRY(
+		/* This quirk is preset as of firmware revision HACN31WW */
+		"LENOVO", "Legion S7 15ACH6",
+		QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
+	),
+	{ }
+};
+
 /**
  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
@@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
 	return 0;
 }
 
+/* Scale the current brightness level of 'from' to the range of 'to'. */
+static int scale_backlight_level(const struct backlight_device *from,
+				 const struct backlight_device *to)
+{
+	int from_max = from->props.max_brightness;
+	int from_level = from->props.brightness;
+	int to_max = to->props.max_brightness;
+
+	return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
+}
+
 static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
 {
 	struct wmi_device *wdev = bl_get_data(bd);
+	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
+	struct backlight_device *proxy_target = priv->proxy_target;
+
+	if (proxy_target) {
+		int level = scale_backlight_level(bd, proxy_target);
+
+		if (backlight_device_set_brightness(proxy_target, level))
+			pr_warn("Failed to relay backlight update to \"%s\"",
+				backlight_proxy_target);
+	}
 
 	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
 	                             WMI_BRIGHTNESS_MODE_SET,
@@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
 	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
 };
 
+static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
+{
+
+	/*
+	 * On some systems, the EC backlight level gets reset to 100% when
+	 * resuming from suspend, but the backlight device state still reflects
+	 * the pre-suspend value. Refresh the existing state to sync the EC's
+	 * state back up with the kernel's.
+	 */
+	if (event == PM_POST_SUSPEND) {
+		struct nvidia_wmi_ec_backlight_priv *p;
+		int ret;
+
+		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
+		ret = backlight_update_status(p->bl_dev);
+
+		if (ret)
+			pr_warn("failed to refresh backlight level: %d", ret);
+
+		return NOTIFY_OK;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static void putdev(void *data)
+{
+	struct device *dev = data;
+
+	put_device(dev);
+}
+
 static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
 {
+	struct backlight_device *bdev, *target = NULL;
+	struct nvidia_wmi_ec_backlight_priv *priv;
 	struct backlight_properties props = {};
-	struct backlight_device *bdev;
 	u32 source;
 	int ret;
 
+	/*
+	 * Check quirks tables to see if this system needs any of the firmware
+	 * bug workarounds.
+	 */
+	dmi_check_system(quirks_table);
+
+	if (backlight_proxy_target && backlight_proxy_target[0]) {
+		static int num_reprobe_attempts;
+
+		target = backlight_device_get_by_name(backlight_proxy_target);
+
+		if (target) {
+			ret = devm_add_action_or_reset(&wdev->dev, putdev,
+						       &target->dev);
+			if (ret)
+				return ret;
+		} else {
+			/*
+			 * The target backlight device might not be ready;
+			 * try again and disable backlight proxying if it
+			 * fails too many times.
+			 */
+			if (num_reprobe_attempts < max_reprobe_attempts) {
+				num_reprobe_attempts++;
+				return -EPROBE_DEFER;
+			}
+
+			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
+				backlight_proxy_target, max_reprobe_attempts);
+		}
+	}
+
 	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
 	                           WMI_BRIGHTNESS_MODE_GET, &source);
 	if (ret)
@@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
 					      &wdev->dev, wdev,
 					      &nvidia_wmi_ec_backlight_ops,
 					      &props);
-	return PTR_ERR_OR_ZERO(bdev);
+
+	if (IS_ERR(bdev))
+		return PTR_ERR(bdev);
+
+	priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->bl_dev = bdev;
+
+	dev_set_drvdata(&wdev->dev, priv);
+
+	if (target) {
+		int level = scale_backlight_level(target, bdev);
+
+		if (backlight_device_set_brightness(bdev, level))
+			pr_warn("Unable to import initial brightness level from %s.",
+				backlight_proxy_target);
+		priv->proxy_target = target;
+	}
+
+	if (restore_level_on_resume) {
+		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
+		register_pm_notifier(&priv->nb);
+	}
+
+	return 0;
+}
+
+static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
+{
+	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
+
+	if (priv->nb.notifier_call)
+		unregister_pm_notifier(&priv->nb);
 }
 
 #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
@@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
 		.name = "nvidia-wmi-ec-backlight",
 	},
 	.probe = nvidia_wmi_ec_backlight_probe,
+	.remove = nvidia_wmi_ec_backlight_remove,
 	.id_table = nvidia_wmi_ec_backlight_id_table,
 };
 module_wmi_driver(nvidia_wmi_ec_backlight_driver);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
@ 2022-03-16 21:28       ` Daniel Dadap
  2022-03-16 22:09         ` Alexandru Dinu
  2022-03-17 12:17       ` Hans de Goede
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2022-03-16 21:28 UTC (permalink / raw)
  To: platform-driver-x86
  Cc: pobrn, hdegoede, markgross, Mario.Limonciello, Alexandru Dinu

Sorry, just noticed a typo in a comment:

/* This quirk is preset as of firmware revision HACN31WW */

Obviously that is meant to read "present". I'll fix that with the next 
round of changes, assuming there will be additional review feedback.

On 3/16/22 15:33, Daniel Dadap wrote:
> Some notebook systems with EC-driven backlight control appear to have a
> firmware bug which causes the system to use GPU-driven backlight control
> upon a fresh boot, but then switch to EC-driven backlight control
> after completing a suspend/resume cycle. All the while, the firmware
> reports that the backlight is under EC control, regardless of what is
> actually controlling the backlight brightness.
>
> This leads to the following behavior:
>
> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>    WMI-wrapped ACPI method erroneously reporting EC control.
> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>    cycle, due to the backlight control actually being GPU-driven.
> * GPU drivers also register their own backlight handlers: in the case
>    of the notebook system where this behavior has been observed, both
>    amdgpu and the NVIDIA proprietary driver register backlight handlers.
> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>    case observed so far) can successfully control the backlight through
>    its backlight driver's sysfs interface, but stops working after the
>    first suspend/resume cycle.
> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>    fresh boot, but begins to work after the first suspend/resume cycle.
> * The GPU which does not have backlight control (NVIDIA in this case)
>    is not able to control the backlight at any point while the system
>    is in operation. On similar hybrid systems with an EC-controlled
>    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>    does not register its backlight handler. It has not been determined
>    whether the non-functional handler registered by the NVIDIA driver
>    is due to another firmware bug, or a bug in the NVIDIA driver.
>
> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> device, it takes precedence over the BACKLIGHT_RAW devices registered
> by the GPU drivers. This in turn leads to backlight control appearing
> to be non-functional until after completing a suspend/resume cycle.
> However, it is still possible to control the backlight through direct
> interaction with the working GPU driver's backlight sysfs interface.
>
> These systems also appear to have a second firmware bug which resets
> the EC's brightness level to 100% on resume, but leaves the state in
> the kernel at the pre-suspend level. This causes attempts to save
> and restore the backlight level across the suspend/resume cycle to
> fail, due to the level appearing not to change even though it did.
>
> In order to work around these issues, add a quirk table to detect
> systems that are known to show these behaviors. So far, there is
> only one known system that requires these workarounds, and both
> issues are present on that system, but the quirks are tracked
> separately to make it easier to add them to other systems which
> may exhibit one of the bugs, but not the other. The original systems
> that this driver was tested on during development do not exhibit
> either of these quirks.
>
> If a system with the "GPU driver has backlight control" quirk is
> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> (when freshly booted) GPU backlight handler and relay any backlight
> brightness level change requests directed at the EC to also be applied
> to the GPU backlight interface. This leads to redundant updates
> directed at the GPU backlight driver after a suspend/resume cycle, but
> it does allow the EC backlight control to work when the system is
> freshly booted.
>
> If a system with the "backlight level reset to full on resume" quirk
> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> reset the backlight to the previous level upon resume.
>
> These workarounds are also plumbed through to kernel module parameters,
> to make it easier for users who suspect they may be affected by one or
> both of these bugs to test whether these workarounds are effective on
> their systems as well.
>
> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> ---
> Note: the Tested-by: line above applies to the previous version of this
> patch; an explicit ACK from the tester is required for it to apply to
> the current version.
>
> v2:
>   * Add readable sysfs files for module params, use linear interpolation
>     from fixp-arith.h, fix return value of notifier callback, use devm_*()
>     for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>   * Add comment to denote known firmware versions that exhibit the bugs.
>     (Mario Limonciello <Mario.Limonciello@amd.com>)
>   * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>
>   .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>   1 file changed, 194 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> index 61e37194df70..95e1ddf780fc 100644
> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> @@ -3,8 +3,12 @@
>    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>    */
>   
> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> +
>   #include <linux/acpi.h>
>   #include <linux/backlight.h>
> +#include <linux/dmi.h>
> +#include <linux/fixp-arith.h>
>   #include <linux/mod_devicetable.h>
>   #include <linux/module.h>
>   #include <linux/types.h>
> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>   	u32 ignored[3];
>   };
>   
> +/**
> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> + * @bl_dev:       the associated backlight device
> + * @proxy_target: backlight device which receives relayed brightness changes
> + * @nb:           notifier block for resume callback
> + */
> +struct nvidia_wmi_ec_backlight_priv {
> +	struct backlight_device *bl_dev;
> +	struct backlight_device *proxy_target;
> +	struct notifier_block nb;
> +};
> +
> +static char *backlight_proxy_target;
> +module_param(backlight_proxy_target, charp, 0444);
> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> +
> +static int max_reprobe_attempts = 128;
> +module_param(max_reprobe_attempts, int, 0444);
> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> +
> +static bool restore_level_on_resume;
> +module_param(restore_level_on_resume, bool, 0444);
> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> +
> +/* Bit field values for quirks table */
> +
> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> +
> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> +
> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> +
> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
> +
> +static int assign_quirks(const struct dmi_system_id *id)
> +{
> +	if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> +		restore_level_on_resume = 1;
> +
> +	/* If the module parameter is set, override the quirks table */
> +	if (!backlight_proxy_target) {
> +		if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> +			backlight_proxy_target = "amdgpu_bl1";
> +	}
> +
> +	return true;
> +}
> +
> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> +	.callback = assign_quirks,                      \
> +	.matches = {                                    \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> +	},                                              \
> +	.driver_data = (void *)(quirks)                 \
> +}
> +
> +static const struct dmi_system_id quirks_table[] = {
> +	QUIRK_ENTRY(
> +		/* This quirk is preset as of firmware revision HACN31WW */
> +		"LENOVO", "Legion S7 15ACH6",
> +		QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> +	),
> +	{ }
> +};
> +
>   /**
>    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>   	return 0;
>   }
>   
> +/* Scale the current brightness level of 'from' to the range of 'to'. */
> +static int scale_backlight_level(const struct backlight_device *from,
> +				 const struct backlight_device *to)
> +{
> +	int from_max = from->props.max_brightness;
> +	int from_level = from->props.brightness;
> +	int to_max = to->props.max_brightness;
> +
> +	return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> +}
> +
>   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>   {
>   	struct wmi_device *wdev = bl_get_data(bd);
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target) {
> +		int level = scale_backlight_level(bd, proxy_target);
> +
> +		if (backlight_device_set_brightness(proxy_target, level))
> +			pr_warn("Failed to relay backlight update to \"%s\"",
> +				backlight_proxy_target);
> +	}
>   
>   	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>   	                             WMI_BRIGHTNESS_MODE_SET,
> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>   	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>   };
>   
> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> +{
> +
> +	/*
> +	 * On some systems, the EC backlight level gets reset to 100% when
> +	 * resuming from suspend, but the backlight device state still reflects
> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> +	 * state back up with the kernel's.
> +	 */
> +	if (event == PM_POST_SUSPEND) {
> +		struct nvidia_wmi_ec_backlight_priv *p;
> +		int ret;
> +
> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> +		ret = backlight_update_status(p->bl_dev);
> +
> +		if (ret)
> +			pr_warn("failed to refresh backlight level: %d", ret);
> +
> +		return NOTIFY_OK;
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static void putdev(void *data)
> +{
> +	struct device *dev = data;
> +
> +	put_device(dev);
> +}
> +
>   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>   {
> +	struct backlight_device *bdev, *target = NULL;
> +	struct nvidia_wmi_ec_backlight_priv *priv;
>   	struct backlight_properties props = {};
> -	struct backlight_device *bdev;
>   	u32 source;
>   	int ret;
>   
> +	/*
> +	 * Check quirks tables to see if this system needs any of the firmware
> +	 * bug workarounds.
> +	 */
> +	dmi_check_system(quirks_table);
> +
> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> +		static int num_reprobe_attempts;
> +
> +		target = backlight_device_get_by_name(backlight_proxy_target);
> +
> +		if (target) {
> +			ret = devm_add_action_or_reset(&wdev->dev, putdev,
> +						       &target->dev);
> +			if (ret)
> +				return ret;
> +		} else {
> +			/*
> +			 * The target backlight device might not be ready;
> +			 * try again and disable backlight proxying if it
> +			 * fails too many times.
> +			 */
> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> +				num_reprobe_attempts++;
> +				return -EPROBE_DEFER;
> +			}
> +
> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> +				backlight_proxy_target, max_reprobe_attempts);
> +		}
> +	}
> +
>   	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>   	                           WMI_BRIGHTNESS_MODE_GET, &source);
>   	if (ret)
> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>   					      &wdev->dev, wdev,
>   					      &nvidia_wmi_ec_backlight_ops,
>   					      &props);
> -	return PTR_ERR_OR_ZERO(bdev);
> +
> +	if (IS_ERR(bdev))
> +		return PTR_ERR(bdev);
> +
> +	priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	priv->bl_dev = bdev;
> +
> +	dev_set_drvdata(&wdev->dev, priv);
> +
> +	if (target) {
> +		int level = scale_backlight_level(target, bdev);
> +
> +		if (backlight_device_set_brightness(bdev, level))
> +			pr_warn("Unable to import initial brightness level from %s.",
> +				backlight_proxy_target);
> +		priv->proxy_target = target;
> +	}
> +
> +	if (restore_level_on_resume) {
> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> +		register_pm_notifier(&priv->nb);
> +	}
> +
> +	return 0;
> +}
> +
> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> +{
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +
> +	if (priv->nb.notifier_call)
> +		unregister_pm_notifier(&priv->nb);
>   }
>   
>   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>   		.name = "nvidia-wmi-ec-backlight",
>   	},
>   	.probe = nvidia_wmi_ec_backlight_probe,
> +	.remove = nvidia_wmi_ec_backlight_remove,
>   	.id_table = nvidia_wmi_ec_backlight_id_table,
>   };
>   module_wmi_driver(nvidia_wmi_ec_backlight_driver);
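
For anyone who wants to sanity-check the level scaling outside the kernel,
here is a minimal user-space sketch of what scale_backlight_level() computes.
The linear_interpolate() helper is written from memory of the kernel's
fixp_linear_interpolate() in <linux/fixp-arith.h>, so treat it as an
approximation rather than a copy. struct bl_props and scale_level() are
hypothetical stand-ins for the backlight_device fields the patch reads.

```c
/*
 * User-space approximation of fixp_linear_interpolate(): maps x from the
 * range [x0, x1] onto [y0, y1] using integer arithmetic (truncating).
 */
static int linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	if (y0 == y1 || x == x0)
		return y0;
	if (x1 == x0 || x == x1)
		return y1;

	return y0 + ((y1 - y0) * (x - x0) / (x1 - x0));
}

/* Hypothetical reduced view of a backlight device (not a kernel type). */
struct bl_props {
	int brightness;
	int max_brightness;
};

/* Scale the current level of 'from' into the range of 'to'. */
static int scale_level(const struct bl_props *from, const struct bl_props *to)
{
	return linear_interpolate(0, 0, from->max_brightness,
				  to->max_brightness, from->brightness);
}
```

Because the division truncates, a mid-range level can land one step below
the exact proportional value. The endpoints are handled by the early returns
and always map exactly.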


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 21:28       ` Daniel Dadap
@ 2022-03-16 22:09         ` Alexandru Dinu
  2022-03-16 22:14           ` Alexandru Dinu
  2023-01-30 22:00           ` Daniel Dadap
  0 siblings, 2 replies; 31+ messages in thread
From: Alexandru Dinu @ 2022-03-16 22:09 UTC (permalink / raw)
  To: platform-driver-x86
  Cc: Daniel Dadap, Barnabás Pőcze, Hans de Goede, markgross,
	Limonciello, Mario, Deucher, Alexander

> Note: the Tested-by: line above applies to the previous version of this
> patch; an explicit ACK from the tester is required for it to apply to
> the current version.

I compiled and tested v2 on 5.16.14.
Everything works as expected: brightness control & level restore work
both on first boot and on subsequent sleep/resume cycles.
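
The behavior described above can be illustrated with a toy user-space model.
This is only a sketch under stated assumptions, not driver code: struct
panel_model and all three functions are hypothetical, and the 0-100 (EC) and
0-255 (GPU) ranges are made-up values for illustration. It models a panel
that follows the GPU level until the first resume and the EC level
afterwards, with the EC level forced to 100% on every resume.

```c
#include <stdbool.h>

/* Hypothetical model of the affected firmware; not a kernel structure. */
struct panel_model {
	bool resumed_once; /* false on fresh boot, true after first resume */
	int gpu_level;     /* last level written via the GPU backlight, 0-255 */
	int ec_level;      /* last level written via the EC, 0-100 */
};

/* Visible brightness in percent: GPU drives before first resume, EC after. */
static int panel_effective_percent(const struct panel_model *p)
{
	if (p->resumed_once)
		return p->ec_level;
	return p->gpu_level * 100 / 255;
}

/* Mimics the proxy workaround: every EC update is also relayed to the GPU. */
static void set_brightness_with_relay(struct panel_model *p, int ec_level)
{
	p->gpu_level = ec_level * 255 / 100;
	p->ec_level = ec_level;
}

/*
 * Mimics a resume on the quirky firmware: the EC level is reset to 100%,
 * and the kernel's saved level is re-applied only if 'restore' is set.
 */
static void model_resume(struct panel_model *p, int kernel_level, bool restore)
{
	p->resumed_once = true;
	p->ec_level = 100;
	if (restore)
		set_brightness_with_relay(p, kernel_level);
}
```

In this model, a request for 40% is visible both before and after the first
resume when both workarounds are active. With the restore step disabled, the
panel comes back at 100% while the kernel still believes it is at 40%.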

Regards,
Alex





* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 22:09         ` Alexandru Dinu
@ 2022-03-16 22:14           ` Alexandru Dinu
  2023-01-30 22:00           ` Daniel Dadap
  1 sibling, 0 replies; 31+ messages in thread
From: Alexandru Dinu @ 2022-03-16 22:14 UTC (permalink / raw)
  To: platform-driver-x86

> Note: the Tested-by: line above applies to the previous version of this
> patch; an explicit ACK from the tester is required for it to apply to
> the current version.

I compiled and tested v2 on 5.16.14.
Everything works as expected: brightness control & level restore work
both on first boot and on subsequent sleep/resume cycles.

Regards,
Alex


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
  2022-03-16 21:28       ` Daniel Dadap
@ 2022-03-17 12:17       ` Hans de Goede
  2022-03-17 13:28         ` Daniel Dadap
  1 sibling, 1 reply; 31+ messages in thread
From: Hans de Goede @ 2022-03-17 12:17 UTC (permalink / raw)
  To: Daniel Dadap, platform-driver-x86
  Cc: pobrn, markgross, Mario.Limonciello, Alexandru Dinu

Hi,

On 3/16/22 21:33, Daniel Dadap wrote:
> Some notebook systems with EC-driven backlight control appear to have a
> firmware bug which causes the system to use GPU-driven backlight control
> upon a fresh boot, but then switch to EC-driven backlight control
> after completing a suspend/resume cycle. All the while, the firmware
> reports that the backlight is under EC control, regardless of what is
> actually controlling the backlight brightness.
> 
> This leads to the following behavior:
> 
> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>   WMI-wrapped ACPI method erroneously reporting EC control.
> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>   cycle, due to the backlight control actually being GPU-driven.
> * GPU drivers also register their own backlight handlers: in the case
>   of the notebook system where this behavior has been observed, both
>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>   case observed so far) can successfully control the backlight through
>   its backlight driver's sysfs interface, but stops working after the
>   first suspend/resume cycle.
> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>   fresh boot, but begins to work after the first suspend/resume cycle.
> * The GPU which does not have backlight control (NVIDIA in this case)
>   is not able to control the backlight at any point while the system
>   is in operation. On similar hybrid systems with an EC-controlled
>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>   does not register its backlight handler. It has not been determined
>   whether the non-functional handler registered by the NVIDIA driver
>   is due to another firmware bug, or a bug in the NVIDIA driver.
> 
> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> device, it takes precedence over the BACKLIGHT_RAW devices registered
> by the GPU drivers. This in turn leads to backlight control appearing
> to be non-functional until after completing a suspend/resume cycle.
> However, it is still possible to control the backlight through direct
> interaction with the working GPU driver's backlight sysfs interface.
> 
> These systems also appear to have a second firmware bug which resets
> the EC's brightness level to 100% on resume, but leaves the state in
> the kernel at the pre-suspend level. This causes attempts to save
> and restore the backlight level across the suspend/resume cycle to
> fail, due to the level appearing not to change even though it did.
> 
> In order to work around these issues, add a quirk table to detect
> systems that are known to show these behaviors. So far, there is
> only one known system that requires these workarounds, and both
> issues are present on that system, but the quirks are tracked
> separately to make it easier to add them to other systems which
> may exhibit one of the bugs, but not the other. The original systems
> that this driver was tested on during development do not exhibit
> either of these quirks.
> 
> If a system with the "GPU driver has backlight control" quirk is
> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> (when freshly booted) GPU backlight handler and relays any backlight
> brightness level change requests directed at the EC to also be applied
> to the GPU backlight interface. This leads to redundant updates
> directed at the GPU backlight driver after a suspend/resume cycle, but
> it does allow the EC backlight control to work when the system is
> freshly booted.

Ugh, I'm really not a fan of the backlight proxy plan here. I have
plans to clean up the whole x86 backlight mess soon and an important part
of that is to stop registering multiple backlight interfaces for the
same panel/screen.

Whereas going with this workaround requires us to have 2 backlight
interfaces active. Also this will very likely lead to (subtly)
different backlight behavior before and after the first
suspend/resume.
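
One possible source of such a difference is the integer scaling between
the two brightness ranges. The userspace sketch below copies the
arithmetic that scale_backlight_level() in the patch performs through
fixp_linear_interpolate(); the helper names and the ranges 0-100 and
0-255 are assumptions chosen for illustration, not values read from the
affected system.

```c
#include <assert.h>

/*
 * Userspace copy of the arithmetic used by the kernel's
 * fixp_linear_interpolate() helper (integer division truncates).
 */
static inline int linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	if (y0 == y1 || x == x0)
		return y0;
	if (x1 == x0 || x == x1)
		return y1;

	return y0 + ((y1 - y0) * (x - x0) / (x1 - x0));
}

/*
 * Hypothetical stand-in for scale_backlight_level(): map 'level' from
 * the range [0, from_max] onto the range [0, to_max].
 */
int scale_level(int level, int from_max, int to_max)
{
	assert(from_max > 0 && level >= 0 && level <= from_max);

	return linear_interpolate(0, 0, from_max, to_max, level);
}

/*
 * Count how many levels in [0, from_max] come back changed after being
 * scaled to [0, to_max] and then scaled back again.
 */
int count_lossy_round_trips(int from_max, int to_max)
{
	int lossy = 0;

	for (int level = 0; level <= from_max; level++) {
		int there = scale_level(level, from_max, to_max);
		int back = scale_level(there, to_max, from_max);

		if (back != level)
			lossy++;
	}

	return lossy;
}
```

With these assumed ranges, a level of 50 out of 100 scales to 127 out of
255, and 127 scales back to 49, so a level imported from the proxy
target at probe time need not match the level later relayed to it.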

Is there no other way to solve this issue? Maybe we need to poke
vgaswitcheroo to set the current GPU mode, even though this is
already reported as active, to get things to switch to the EC's
control right away?

I'm pretty certain that Windows is not doing this backlight proxying.
IMHO we need to figure out what causes the switch after suspend/resume
and then do that thing at boot.

Regards,

Hans



> 
> If a system with the "backlight level reset to full on resume" quirk
> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> reset the backlight to the previous level upon resume.
> 
> These workarounds are also plumbed through to kernel module parameters,
> to make it easier for users who suspect they may be affected by one or
> both of these bugs to test whether these workarounds are effective on
> their systems as well.
> 
> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> ---
> Note: the Tested-by: line above applies to the previous version of this
> patch; an explicit ACK from the tester is required for it to apply to
> the current version.
> 
> v2:
>  * Add readable sysfs files for module params, use linear interpolation
>    from fixp-arith.h, fix return value of notifier callback, use devm_*()
>    for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>  * Add comment to denote known firmware versions that exhibit the bugs.
>    (Mario Limonciello <Mario.Limonciello@amd.com>)
>  * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
> 
>  .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>  1 file changed, 194 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> index 61e37194df70..95e1ddf780fc 100644
> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> @@ -3,8 +3,12 @@
>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>   */
>  
> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> +
>  #include <linux/acpi.h>
>  #include <linux/backlight.h>
> +#include <linux/dmi.h>
> +#include <linux/fixp-arith.h>
>  #include <linux/mod_devicetable.h>
>  #include <linux/module.h>
>  #include <linux/types.h>
> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>  	u32 ignored[3];
>  };
>  
> +/**
> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> + * @bl_dev:       the associated backlight device
> + * @proxy_target: backlight device which receives relayed brightness changes
> + * @notifier:     notifier block for resume callback
> + */
> +struct nvidia_wmi_ec_backlight_priv {
> +	struct backlight_device *bl_dev;
> +	struct backlight_device *proxy_target;
> +	struct notifier_block nb;
> +};
> +
> +static char *backlight_proxy_target;
> +module_param(backlight_proxy_target, charp, 0444);
> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> +
> +static int max_reprobe_attempts = 128;
> +module_param(max_reprobe_attempts, int, 0444);
> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> +
> +static bool restore_level_on_resume;
> +module_param(restore_level_on_resume, bool, 0444);
> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> +
> +/* Bit field values for quirks table */
> +
> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> +
> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> +
> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> +
> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
> +
> +static int assign_quirks(const struct dmi_system_id *id)
> +{
> +	if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> +		restore_level_on_resume = 1;
> +
> +	/* If the module parameter is set, override the quirks table */
> +	if (!backlight_proxy_target) {
> +		if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> +			backlight_proxy_target = "amdgpu_bl1";
> +	}
> +
> +	return true;
> +}
> +
> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> +	.callback = assign_quirks,                      \
> +	.matches = {                                    \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> +	},                                              \
> +	.driver_data = (void *)(quirks)                 \
> +}
> +
> +static const struct dmi_system_id quirks_table[] = {
> +	QUIRK_ENTRY(
> +		/* This quirk is preset as of firmware revision HACN31WW */
> +		"LENOVO", "Legion S7 15ACH6",
> +		QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> +	),
> +	{ }
> +};
> +
>  /**
>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>  	return 0;
>  }
>  
> +/* Scale the current brightness level of 'from' to the range of 'to'. */
> +static int scale_backlight_level(const struct backlight_device *from,
> +				 const struct backlight_device *to)
> +{
> +	int from_max = from->props.max_brightness;
> +	int from_level = from->props.brightness;
> +	int to_max = to->props.max_brightness;
> +
> +	return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> +}
> +
>  static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>  {
>  	struct wmi_device *wdev = bl_get_data(bd);
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target) {
> +		int level = scale_backlight_level(bd, proxy_target);
> +
> +		if (backlight_device_set_brightness(proxy_target, level))
> +			pr_warn("Failed to relay backlight update to \"%s\"",
> +				backlight_proxy_target);
> +	}
>  
>  	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>  	                             WMI_BRIGHTNESS_MODE_SET,
> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>  	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>  };
>  
> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> +{
> +
> +	/*
> +	 * On some systems, the EC backlight level gets reset to 100% when
> +	 * resuming from suspend, but the backlight device state still reflects
> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> +	 * state back up with the kernel's.
> +	 */
> +	if (event == PM_POST_SUSPEND) {
> +		struct nvidia_wmi_ec_backlight_priv *p;
> +		int ret;
> +
> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> +		ret = backlight_update_status(p->bl_dev);
> +
> +		if (ret)
> +			pr_warn("failed to refresh backlight level: %d", ret);
> +
> +		return NOTIFY_OK;
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
> +static void putdev(void *data)
> +{
> +	struct device *dev = data;
> +
> +	put_device(dev);
> +}
> +
>  static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>  {
> +	struct backlight_device *bdev, *target = NULL;
> +	struct nvidia_wmi_ec_backlight_priv *priv;
>  	struct backlight_properties props = {};
> -	struct backlight_device *bdev;
>  	u32 source;
>  	int ret;
>  
> +	/*
> +	 * Check quirks tables to see if this system needs any of the firmware
> +	 * bug workarounds.
> +	 */
> +	dmi_check_system(quirks_table);
> +
> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> +		static int num_reprobe_attempts;
> +
> +		target = backlight_device_get_by_name(backlight_proxy_target);
> +
> +		if (target) {
> +			ret = devm_add_action_or_reset(&wdev->dev, putdev,
> +						       &target->dev);
> +			if (ret)
> +				return ret;
> +		} else {
> +			/*
> +			 * The target backlight device might not be ready;
> +			 * try again and disable backlight proxying if it
> +			 * fails too many times.
> +			 */
> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> +				num_reprobe_attempts++;
> +				return -EPROBE_DEFER;
> +			}
> +
> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> +				backlight_proxy_target, max_reprobe_attempts);
> +		}
> +	}
> +
>  	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>  	                           WMI_BRIGHTNESS_MODE_GET, &source);
>  	if (ret)
> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>  					      &wdev->dev, wdev,
>  					      &nvidia_wmi_ec_backlight_ops,
>  					      &props);
> -	return PTR_ERR_OR_ZERO(bdev);
> +
> +	if (IS_ERR(bdev))
> +		return PTR_ERR(bdev);
> +
> +	priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	priv->bl_dev = bdev;
> +
> +	dev_set_drvdata(&wdev->dev, priv);
> +
> +	if (target) {
> +		int level = scale_backlight_level(target, bdev);
> +
> +		if (backlight_device_set_brightness(bdev, level))
> +			pr_warn("Unable to import initial brightness level from %s.",
> +				backlight_proxy_target);
> +		priv->proxy_target = target;
> +	}
> +
> +	if (restore_level_on_resume) {
> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> +		register_pm_notifier(&priv->nb);
> +	}
> +
> +	return 0;
> +}
> +
> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> +{
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +
> +	if (priv->nb.notifier_call)
> +		unregister_pm_notifier(&priv->nb);
>  }
>  
>  #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>  		.name = "nvidia-wmi-ec-backlight",
>  	},
>  	.probe = nvidia_wmi_ec_backlight_probe,
> +	.remove = nvidia_wmi_ec_backlight_remove,
>  	.id_table = nvidia_wmi_ec_backlight_id_table,
>  };
>  module_wmi_driver(nvidia_wmi_ec_backlight_driver);



* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 12:17       ` Hans de Goede
@ 2022-03-17 13:28         ` Daniel Dadap
  2022-03-17 16:42             ` Hans de Goede
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2022-03-17 13:28 UTC (permalink / raw)
  To: Hans de Goede
  Cc: platform-driver-x86, pobrn, markgross, Mario.Limonciello, Alexandru Dinu


> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
> 
> Hi,
> 
>> On 3/16/22 21:33, Daniel Dadap wrote:
>> Some notebook systems with EC-driven backlight control appear to have a
>> firmware bug which causes the system to use GPU-driven backlight control
>> upon a fresh boot, but then switches to EC-driven backlight control
>> after completing a suspend/resume cycle. All the while, the firmware
>> reports that the backlight is under EC control, regardless of what is
>> actually controlling the backlight brightness.
>> 
>> This leads to the following behavior:
>> 
>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>  WMI-wrapped ACPI method erroneously reporting EC control.
>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>  cycle, due to the backlight control actually being GPU-driven.
>> * GPU drivers also register their own backlight handlers: in the case
>>  of the notebook system where this behavior has been observed, both
>>  amdgpu and the NVIDIA proprietary driver register backlight handlers.
>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>  case observed so far) can successfully control the backlight through
>>  its backlight driver's sysfs interface, but stops working after the
>>  first suspend/resume cycle.
>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>  fresh boot, but begins to work after the first suspend/resume cycle.
>> * The GPU which does not have backlight control (NVIDIA in this case)
>>  is not able to control the backlight at any point while the system
>>  is in operation. On similar hybrid systems with an EC-controlled
>>  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>  does not register its backlight handler. It has not been determined
>>  whether the non-functional handler registered by the NVIDIA driver
>>  is due to another firmware bug, or a bug in the NVIDIA driver.
>> 
>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>> by the GPU drivers. This in turn leads to backlight control appearing
>> to be non-functional until after completing a suspend/resume cycle.
>> However, it is still possible to control the backlight through direct
>> interaction with the working GPU driver's backlight sysfs interface.
>> 
>> These systems also appear to have a second firmware bug which resets
>> the EC's brightness level to 100% on resume, but leaves the state in
>> the kernel at the pre-suspend level. This causes attempts to save
>> and restore the backlight level across the suspend/resume cycle to
>> fail, due to the level appearing not to change even though it did.
>> 
>> In order to work around these issues, add a quirk table to detect
>> systems that are known to show these behaviors. So far, there is
>> only one known system that requires these workarounds, and both
>> issues are present on that system, but the quirks are tracked
>> separately to make it easier to add them to other systems which
>> may exhibit one of the bugs, but not the other. The original systems
>> that this driver was tested on during development do not exhibit
>> either of these quirks.
>> 
>> If a system with the "GPU driver has backlight control" quirk is
>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>> (when freshly booted) GPU backlight handler and relays any backlight
>> brightness level change requests directed at the EC to also be applied
>> to the GPU backlight interface. This leads to redundant updates
>> directed at the GPU backlight driver after a suspend/resume cycle, but
>> it does allow the EC backlight control to work when the system is
>> freshly booted.
> 
> Ugh, I'm really not a fan of the backlight proxy plan here. I have
> plans to clean-up the whole x86 backlight mess soon and an important part
> of that is to stop registering multiple backlight interfaces for the
> same panel/screen.
> 
> Where as going with this workaround requires us to have 2 active
> backlight interfaces active. Also this will very likely work to
> (subtly) different backlight behavior before and after the first
> suspend/resume.

I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel? I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.

This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.

Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
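
For reference, the force refresh workaround boils down to the following, shown here as a userspace simulation rather than kernel code. Every sim_* name is hypothetical: the struct stands in for the backlight device state, sim_firmware_resume() models the firmware resetting the EC to 100%, and sim_pm_notifier() models the callback in the patch re-sending the cached level on the post-suspend event.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PM events and notifier results. */
enum sim_pm_event { SIM_PM_SUSPEND_PREPARE, SIM_PM_POST_SUSPEND };
enum sim_notify_ret { SIM_NOTIFY_DONE, SIM_NOTIFY_OK };

struct sim_backlight {
	int cached_level; /* level the kernel believes is set */
	int ec_level;     /* level the EC is actually driving */
	int max_level;
};

/* Models update_status(): push the cached level down to the EC. */
int sim_update_status(struct sim_backlight *bl)
{
	assert(bl != NULL);

	if (bl->cached_level < 0 || bl->cached_level > bl->max_level)
		return -1;

	bl->ec_level = bl->cached_level;
	return 0;
}

/* Models the firmware bug: resume resets the EC to full brightness. */
void sim_firmware_resume(struct sim_backlight *bl)
{
	assert(bl != NULL);
	bl->ec_level = bl->max_level;
}

/*
 * Models the PM notifier callback: ignore every event except
 * post-suspend, and on post-suspend re-send the cached level so the
 * EC matches the kernel's state again.
 */
int sim_pm_notifier(struct sim_backlight *bl, enum sim_pm_event event)
{
	if (event != SIM_PM_POST_SUSPEND)
		return SIM_NOTIFY_DONE;

	sim_update_status(bl);
	return SIM_NOTIFY_OK;
}
```

The cached level never changes across the cycle in this model, which is why simply re-sending it is enough; nothing needs to be saved before suspend.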

> Is there no other way to solve this issue? Maybe we need to poke
> vgaswitcheroo to set the current GPU mode even though this is
> already reported as active to get things to switch to the ECs
> control right away ?

There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.

> I'm pretty certain that Windows is not doing this backlight proxying,
> IMHO we need to figure out what causes the switch after suspend/resume
> and then do that thing at boot.

I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.

> Regards,
> 
> Hans
> 
> 
> 
>> 
>> If a system with the "backlight level reset to full on resume" quirk
>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>> reset the backlight to the previous level upon resume.
>> 
>> These workarounds are also plumbed through to kernel module parameters,
>> to make it easier for users who suspect they may be affected by one or
>> both of these bugs to test whether these workarounds are effective on
>> their systems as well.
>> 
>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>> ---
>> Note: the Tested-by: line above applies to the previous version of this
>> patch; an explicit ACK from the tester is required for it to apply to
>> the current version.
>> 
>> v2:
>> * Add readable sysfs files for module params, use linear interpolation
>>   from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>   for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>> * Add comment to denote known firmware versions that exhibit the bugs.
>>   (Mario Limonciello <Mario.Limonciello@amd.com>)
>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>> 
>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>> 1 file changed, 194 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> index 61e37194df70..95e1ddf780fc 100644
>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>> @@ -3,8 +3,12 @@
>>  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>  */
>> 
>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>> +
>> #include <linux/acpi.h>
>> #include <linux/backlight.h>
>> +#include <linux/dmi.h>
>> +#include <linux/fixp-arith.h>
>> #include <linux/mod_devicetable.h>
>> #include <linux/module.h>
>> #include <linux/types.h>
>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>    u32 ignored[3];
>> };
>> 
>> +/**
>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>> + * @bl_dev:       the associated backlight device
>> + * @proxy_target: backlight device which receives relayed brightness changes
>> + * @notifier:     notifier block for resume callback
>> + */
>> +struct nvidia_wmi_ec_backlight_priv {
>> +    struct backlight_device *bl_dev;
>> +    struct backlight_device *proxy_target;
>> +    struct notifier_block nb;
>> +};
>> +
>> +static char *backlight_proxy_target;
>> +module_param(backlight_proxy_target, charp, 0444);
>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>> +
>> +static int max_reprobe_attempts = 128;
>> +module_param(max_reprobe_attempts, int, 0444);
>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>> +
>> +static bool restore_level_on_resume;
>> +module_param(restore_level_on_resume, bool, 0444);
>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>> +
>> +/* Bit field values for quirks table */
>> +
>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>> +
>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>> +
>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>> +
>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>> +
>> +static int assign_quirks(const struct dmi_system_id *id)
>> +{
>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>> +        restore_level_on_resume = 1;
>> +
>> +    /* If the module parameter is set, override the quirks table */
>> +    if (!backlight_proxy_target) {
>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>> +            backlight_proxy_target = "amdgpu_bl1";
>> +    }
>> +
>> +    return true;
>> +}
>> +
>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>> +    .callback = assign_quirks,                      \
>> +    .matches = {                                    \
>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>> +    },                                              \
>> +    .driver_data = (void *)(quirks)                 \
>> +}
>> +
>> +static const struct dmi_system_id quirks_table[] = {
>> +    QUIRK_ENTRY(
>> +        /* This quirk is preset as of firmware revision HACN31WW */
>> +        "LENOVO", "Legion S7 15ACH6",
>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>> +    ),
>> +    { }
>> +};
>> +
>> /**
>>  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>    return 0;
>> }
>> 
>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>> +static int scale_backlight_level(const struct backlight_device *from,
>> +                 const struct backlight_device *to)
>> +{
>> +    int from_max = from->props.max_brightness;
>> +    int from_level = from->props.brightness;
>> +    int to_max = to->props.max_brightness;
>> +
>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>> +}
>> +
>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>> {
>>    struct wmi_device *wdev = bl_get_data(bd);
>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +    struct backlight_device *proxy_target = priv->proxy_target;
>> +
>> +    if (proxy_target) {
>> +        int level = scale_backlight_level(bd, proxy_target);
>> +
>> +        if (backlight_device_set_brightness(proxy_target, level))
>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>> +                backlight_proxy_target);
>> +    }
>> 
>>    return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>                                 WMI_BRIGHTNESS_MODE_SET,
>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>    .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>> };
>> 
>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>> +{
>> +
>> +    /*
>> +     * On some systems, the EC backlight level gets reset to 100% when
>> +     * resuming from suspend, but the backlight device state still reflects
>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>> +     * state back up with the kernel's.
>> +     */
>> +    if (event == PM_POST_SUSPEND) {
>> +        struct nvidia_wmi_ec_backlight_priv *p;
>> +        int ret;
>> +
>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>> +        ret = backlight_update_status(p->bl_dev);
>> +
>> +        if (ret)
>> +            pr_warn("failed to refresh backlight level: %d", ret);
>> +
>> +        return NOTIFY_OK;
>> +    }
>> +
>> +    return NOTIFY_DONE;
>> +}
>> +
>> +static void putdev(void *data)
>> +{
>> +    struct device *dev = data;
>> +
>> +    put_device(dev);
>> +}
>> +
>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>> {
>> +    struct backlight_device *bdev, *target = NULL;
>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>    struct backlight_properties props = {};
>> -    struct backlight_device *bdev;
>>    u32 source;
>>    int ret;
>> 
>> +    /*
>> +     * Check quirks tables to see if this system needs any of the firmware
>> +     * bug workarounds.
>> +     */
>> +    dmi_check_system(quirks_table);
>> +
>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>> +        static int num_reprobe_attempts;
>> +
>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>> +
>> +        if (target) {
>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>> +                               &target->dev);
>> +            if (ret)
>> +                return ret;
>> +        } else {
>> +            /*
>> +             * The target backlight device might not be ready;
>> +             * try again and disable backlight proxying if it
>> +             * fails too many times.
>> +             */
>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>> +                num_reprobe_attempts++;
>> +                return -EPROBE_DEFER;
>> +            }
>> +
>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>> +                backlight_proxy_target, max_reprobe_attempts);
>> +        }
>> +    }
>> +
>>    ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>                               WMI_BRIGHTNESS_MODE_GET, &source);
>>    if (ret)
>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>                          &wdev->dev, wdev,
>>                          &nvidia_wmi_ec_backlight_ops,
>>                          &props);
>> -    return PTR_ERR_OR_ZERO(bdev);
>> +
>> +    if (IS_ERR(bdev))
>> +        return PTR_ERR(bdev);
>> +
>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>> +    if (!priv)
>> +        return -ENOMEM;
>> +
>> +    priv->bl_dev = bdev;
>> +
>> +    dev_set_drvdata(&wdev->dev, priv);
>> +
>> +    if (target) {
>> +        int level = scale_backlight_level(target, bdev);
>> +
>> +        if (backlight_device_set_brightness(bdev, level))
>> +            pr_warn("Unable to import initial brightness level from %s.",
>> +                backlight_proxy_target);
>> +        priv->proxy_target = target;
>> +    }
>> +
>> +    if (restore_level_on_resume) {
>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>> +        register_pm_notifier(&priv->nb);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>> +{
>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>> +
>> +    if (priv->nb.notifier_call)
>> +        unregister_pm_notifier(&priv->nb);
>> }
>> 
>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>        .name = "nvidia-wmi-ec-backlight",
>>    },
>>    .probe = nvidia_wmi_ec_backlight_probe,
>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>    .id_table = nvidia_wmi_ec_backlight_id_table,
>> };
>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> 


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 13:28         ` Daniel Dadap
@ 2022-03-17 16:42             ` Hans de Goede
  0 siblings, 0 replies; 31+ messages in thread
From: Hans de Goede @ 2022-03-17 16:42 UTC (permalink / raw)
  To: Daniel Dadap
  Cc: platform-driver-x86, pobrn, markgross, Mario.Limonciello,
	Alexandru Dinu, dri-devel, Daniel Vetter

Hi Daniel,

On 3/17/22 14:28, Daniel Dadap wrote:
> 
>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>> Some notebook systems with EC-driven backlight control appear to have a
>>> firmware bug which causes the system to use GPU-driven backlight control
>>> upon a fresh boot, but then switches to EC-driven backlight control
>>> after completing a suspend/resume cycle. All the while, the firmware
>>> reports that the backlight is under EC control, regardless of what is
>>> actually controlling the backlight brightness.
>>>
>>> This leads to the following behavior:
>>>
>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>  WMI-wrapped ACPI method erroneously reporting EC control.
>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>  cycle, due to the backlight control actually being GPU-driven.
>>> * GPU drivers also register their own backlight handlers: in the case
>>>  of the notebook system where this behavior has been observed, both
>>>  amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>  case observed so far) can successfully control the backlight through
>>>  its backlight driver's sysfs interface, but stops working after the
>>>  first suspend/resume cycle.
>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>  fresh boot, but begins to work after the first suspend/resume cycle.
>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>  is not able to control the backlight at any point while the system
>>>  is in operation. On similar hybrid systems with an EC-controlled
>>>  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>  does not register its backlight handler. It has not been determined
>>>  whether the non-functional handler registered by the NVIDIA driver
>>>  is due to another firmware bug, or a bug in the NVIDIA driver.
>>>
>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>> by the GPU drivers. This in turn leads to backlight control appearing
>>> to be non-functional until after completing a suspend/resume cycle.
>>> However, it is still possible to control the backlight through direct
>>> interaction with the working GPU driver's backlight sysfs interface.
>>>
>>> These systems also appear to have a second firmware bug which resets
>>> the EC's brightness level to 100% on resume, but leaves the state in
>>> the kernel at the pre-suspend level. This causes attempts to save
>>> and restore the backlight level across the suspend/resume cycle to
>>> fail, due to the level appearing not to change even though it did.
>>>
>>> In order to work around these issues, add a quirk table to detect
>>> systems that are known to show these behaviors. So far, there is
>>> only one known system that requires these workarounds, and both
>>> issues are present on that system, but the quirks are tracked
>>> separately to make it easier to add them to other systems which
>>> may exhibit one of the bugs, but not the other. The original systems
>>> that this driver was tested on during development do not exhibit
>>> either of these quirks.
>>>
>>> If a system with the "GPU driver has backlight control" quirk is
>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>> (when freshly booted) GPU backlight handler and relays any backlight
>>> brightness level change requests directed at the EC to also be applied
>>> to the GPU backlight interface. This leads to redundant updates
>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>> it does allow the EC backlight control to work when the system is
>>> freshly booted.
>>
>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>> plans to clean-up the whole x86 backlight mess soon and an important part
>> of that is to stop registering multiple backlight interfaces for the
>> same panel/screen.
>>
>> Whereas going with this workaround requires us to have 2
>> backlight interfaces active. Also this will very likely lead to
>> (subtly) different backlight behavior before and after the first
>> suspend/resume.
> 
> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?

ATM the kernel basically only supports a bunch of different methods
to control the backlight of 1 internal panel. The plan is to tie this
to the panel from a userspace pov by making the brightness +
max_brightness properties on the drm_connector object for the
internal-panel.

The in kernel tying of the backlight device to the internal panel
will be done hardcoded inside the drm driver(s) based on the
drivers already knowing which connector is the internal panel.

This all naively assumes there is only 1 internal panel, which
for the majority of cases is true. My plan for devices with
2 internal panels is to cross that bridge when we get there
(I expect those mostly in phone/tablet like devices for now
which will likely use devicetree where solving this is trivial).

I do realize we will eventually get some x86/acpi device with
2 internal panels. Hopefully we can just figure out what
the Windows drivers are doing there and parse e.g. the ACPI
info which Windows is using for this.

As part of the move to properties on the drm_connector object
the /sys/class/backlight interface will become deprecated,
but will be kept for backward compat and will eventually
be put behind a Kconfig option.

The kernel internal backlight_device stuff will be kept
since we need some internal representation anyways and
I don't see much value in reworking that, esp. since
we need to have /sys/class/backlight backward compat.

Note this is all based on discussions which I had with
mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
never gotten around to actually starting to work on this,
but this has resurfaced recently and I plan to actually
take a stab at implementing this plan sometime during 2022.

> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.

Right, see above the main idea is to make this
"the kernel's problem" and I expect us to fix this in
the kernel in a variety of different ways depending on
the actual hardware.

As for "troublesome for backlight drivers such as this one
which aren’t associated with any GPU.", the idea is that:

1. E.g. the i915 driver (which I have the most experience with)
knows which connector is the internal panel

2. The acpi_video_get_backlight_type() helper from
drivers/acpi/video_detect.c will get extended to make sure
that there is always only *1* /sys/class/backlight device.

To be specific, atm code supporting old vendor-specific backlight
fw interfaces, e.g. drivers/platform/x86/dell-laptop.c,
already does:

        if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
                return 0;

And drivers/acpi/acpi_video.c also already does:

        if (acpi_video_get_backlight_type() != acpi_backlight_video)
                return 0;

Currently looking at the 3 main x86 backlight interfaces: vendor,
generic-ACPI and native-drm-driver, only the native driver's
backlight registers unconditionally. The plan is to make those also
do a similar check (*) and to also add special backlight drivers like
nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
to this mechanism.

3. 1 + 2 means that the drm_driver can just tie the single
backlight_device which will be registered on the system to
the internal panel.

Again I'm completely ignoring dual-internal-panel devices here
for simplicity's sake.
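
The three points above can be sketched as a small user-space model in C.
This is only an illustration under stated assumptions: the enum values,
struct platform_caps, pick_backlight_type(), should_register() and
count_registered() are hypothetical names invented for this sketch, and
the priority order is an assumption, not what
acpi_video_get_backlight_type() actually implements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical backlight interface types (not the kernel's enum). */
enum bl_type {
	BL_VENDOR,
	BL_ACPI_VIDEO,
	BL_NATIVE,
	BL_NVIDIA_WMI_EC,
	BL_TYPE_COUNT
};

/* Hypothetical description of what the platform offers. */
struct platform_caps {
	bool has_nvidia_wmi_ec;
	bool has_acpi_video;
	bool has_vendor;
};

/*
 * Central decision point: pick exactly one interface type.
 * The priority order used here is an assumption for illustration.
 */
static enum bl_type pick_backlight_type(const struct platform_caps *caps)
{
	assert(caps != NULL);

	if (caps->has_nvidia_wmi_ec)
		return BL_NVIDIA_WMI_EC;
	if (caps->has_acpi_video)
		return BL_ACPI_VIDEO;
	if (caps->has_vendor)
		return BL_VENDOR;
	return BL_NATIVE;
}

/* Each driver asks the same question before registering its device. */
static bool should_register(const struct platform_caps *caps,
			    enum bl_type mine)
{
	return pick_backlight_type(caps) == mine;
}

/* Count how many interface types would end up registered. */
static int count_registered(const struct platform_caps *caps)
{
	int n = 0;

	for (int t = 0; t < BL_TYPE_COUNT; t++)
		if (should_register(caps, (enum bl_type)t))
			n++;
	return n;
}
```

Because every driver consults the same central function, exactly one of
them registers, which is what would let the drm driver tie that single
backlight_device to the internal panel.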

Note this is getting a bit off-topic, but if you have insights
into this, or can already think of ways in which this is not going to
work :)  please let me know.


*) And adding that check + the presence of nvidia-wmi-ec-backlight
support will make the native drm-driver not register its
backlight_device at all, at which point the backlight-proxy workaround
from this patch breaks.


> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
> 
> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.

Good question, I must admit I stopped reading the patch after seeing
the proxy thing.

I see that you are using a pm_notifier for this. I wonder if you
can try (on your own system) to add a pm_ops struct and make
wmi_driver.driver.pm point to that and check if that gets called
by adding e.g. a pr_info (I don't see why it would not get called).

And assuming that works, using that would be a bit cleaner IMHO.
Although that does have resume-ordering implications. But I would
expect the EC to basically be always ready to get talked to at
the point in the resume cycle where normal (non early) resume
handlers are called.

To be clear the idea would be to always have the suspend handler
(so that the driver and pm_ops structs can be const) and to check
a quirk flag inside the resume handler. Or maybe even just always
read back the brightness from the hw and check if it has changed?
Does this need to be behind a quirk ?
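
The two resume-handler variants mentioned above (check a quirk flag, or
always read back and compare) can be compared with a small user-space
model in C. This is a sketch only: struct ec_model, struct bl_state and
all function names are hypothetical and stand in for the real WMI calls
and the backlight_device state; an actual driver would do this from a
resume callback in its pm_ops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the EC's brightness register. */
struct ec_model {
	int hw_level;
	int max_level;
};

/* Hypothetical stand-in for the kernel-side backlight state. */
struct bl_state {
	int cached_level;             /* level the kernel believes is set */
	bool quirk_restore_on_resume; /* set from a quirk table or param  */
};

static int ec_read_level(const struct ec_model *ec)
{
	return ec->hw_level;
}

static void ec_write_level(struct ec_model *ec, int level)
{
	assert(level >= 0 && level <= ec->max_level);
	ec->hw_level = level;
}

/* Models the firmware bug: the EC level jumps to 100% on resume. */
static void firmware_resume_bug(struct ec_model *ec)
{
	ec->hw_level = ec->max_level;
}

/* Variant A: rewrite the cached level only when the quirk flag is set. */
static bool resume_with_quirk(struct ec_model *ec, const struct bl_state *st)
{
	if (!st->quirk_restore_on_resume)
		return false;
	ec_write_level(ec, st->cached_level);
	return true;
}

/* Variant B: no quirk; read back and restore only if the level changed. */
static bool resume_readback(struct ec_model *ec, const struct bl_state *st)
{
	if (ec_read_level(ec) == st->cached_level)
		return false;
	ec_write_level(ec, st->cached_level);
	return true;
}
```

In this model variant B needs no quirk table entry: on unaffected
systems the read-back matches and nothing is written, while on affected
systems the pre-suspend level is restored. Whether the real EC reports
the reset level reliably at resume time is not established in this
thread and would need testing.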

>> Is there no other way to solve this issue? Maybe we need to poke
>> vgaswitcheroo to set the current GPU mode even though this is
>> already reported as active to get things to switch to the EC's
>> control right away ?
> 
> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.

Right, as you said the EC is always supposed to be in control, but
it is not. I would not be surprised if making the ACPI call to put
things in dynamic mode (even though they already are) fixes this,
assuming there is such an ACPI call...

>> I'm pretty certain that Windows is not doing this backlight proxying,
>> IMHO we need to figure out what causes the switch after suspend/resume
>> and then do that thing at boot.
> 
> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.

Great, thank you.

Regards,

Hans



>>> If a system with the "backlight level reset to full on resume" quirk
>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>> reset the backlight to the previous level upon resume.
>>>
>>> These workarounds are also plumbed through to kernel module parameters,
>>> to make it easier for users who suspect they may be affected by one or
>>> both of these bugs to test whether these workarounds are effective on
>>> their systems as well.
>>>
>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>> ---
>>> Note: the Tested-by: line above applies to the previous version of this
>>> patch; an explicit ACK from the tester is required for it to apply to
>>> the current version.
>>>
>>> v2:
>>> * Add readable sysfs files for module params, use linear interpolation
>>>   from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>   for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>   (Mario Limonciello <Mario.Limonciello@amd.com>)
>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>
>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> index 61e37194df70..95e1ddf780fc 100644
>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> @@ -3,8 +3,12 @@
>>>  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>  */
>>>
>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>> +
>>> #include <linux/acpi.h>
>>> #include <linux/backlight.h>
>>> +#include <linux/dmi.h>
>>> +#include <linux/fixp-arith.h>
>>> #include <linux/mod_devicetable.h>
>>> #include <linux/module.h>
>>> #include <linux/types.h>
>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>    u32 ignored[3];
>>> };
>>>
>>> +/**
>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>> + * @bl_dev:       the associated backlight device
>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>> + * @nb:           notifier block for resume callback
>>> + */
>>> +struct nvidia_wmi_ec_backlight_priv {
>>> +    struct backlight_device *bl_dev;
>>> +    struct backlight_device *proxy_target;
>>> +    struct notifier_block nb;
>>> +};
>>> +
>>> +static char *backlight_proxy_target;
>>> +module_param(backlight_proxy_target, charp, 0444);
>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>> +
>>> +static int max_reprobe_attempts = 128;
>>> +module_param(max_reprobe_attempts, int, 0444);
>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>> +
>>> +static bool restore_level_on_resume;
>>> +module_param(restore_level_on_resume, bool, 0444);
>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>> +
>>> +/* Bit field values for quirks table */
>>> +
>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>> +
>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>> +
>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>> +
>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>> +
>>> +static int assign_quirks(const struct dmi_system_id *id)
>>> +{
>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>> +        restore_level_on_resume = true;
>>> +
>>> +    /* If the module parameter is set, override the quirks table */
>>> +    if (!backlight_proxy_target) {
>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>> +            backlight_proxy_target = "amdgpu_bl1";
>>> +    }
>>> +
>>> +    return true;
>>> +}
>>> +
>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>> +    .callback = assign_quirks,                      \
>>> +    .matches = {                                    \
>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>> +    },                                              \
>>> +    .driver_data = (void *)(quirks)                 \
>>> +}
>>> +
>>> +static const struct dmi_system_id quirks_table[] = {
>>> +    QUIRK_ENTRY(
>>> +        /* This quirk is present as of firmware revision HACN31WW */
>>> +        "LENOVO", "Legion S7 15ACH6",
>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>> +    ),
>>> +    { }
>>> +};
>>> +
>>> /**
>>>  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>    return 0;
>>> }
>>>
>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>> +static int scale_backlight_level(const struct backlight_device *from,
>>> +                 const struct backlight_device *to)
>>> +{
>>> +    int from_max = from->props.max_brightness;
>>> +    int from_level = from->props.brightness;
>>> +    int to_max = to->props.max_brightness;
>>> +
>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>> +}
>>> +
>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>> {
>>>    struct wmi_device *wdev = bl_get_data(bd);
>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>> +
>>> +    if (proxy_target) {
>>> +        int level = scale_backlight_level(bd, proxy_target);
>>> +
>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>> +                backlight_proxy_target);
>>> +    }
>>>
>>>    return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>                                 WMI_BRIGHTNESS_MODE_SET,
>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>    .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>> };
>>>
>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>> +{
>>> +
>>> +    /*
>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>> +     * resuming from suspend, but the backlight device state still reflects
>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>> +     * state back up with the kernel's.
>>> +     */
>>> +    if (event == PM_POST_SUSPEND) {
>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>> +        int ret;
>>> +
>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>> +        ret = backlight_update_status(p->bl_dev);
>>> +
>>> +        if (ret)
>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>> +
>>> +        return NOTIFY_OK;
>>> +    }
>>> +
>>> +    return NOTIFY_DONE;
>>> +}
>>> +
>>> +static void putdev(void *data)
>>> +{
>>> +    struct device *dev = data;
>>> +
>>> +    put_device(dev);
>>> +}
>>> +
>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>> {
>>> +    struct backlight_device *bdev, *target = NULL;
>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>    struct backlight_properties props = {};
>>> -    struct backlight_device *bdev;
>>>    u32 source;
>>>    int ret;
>>>
>>> +    /*
>>> +     * Check quirks tables to see if this system needs any of the firmware
>>> +     * bug workarounds.
>>> +     */
>>> +    dmi_check_system(quirks_table);
>>> +
>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>> +        static int num_reprobe_attempts;
>>> +
>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>> +
>>> +        if (target) {
>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>> +                               &target->dev);
>>> +            if (ret)
>>> +                return ret;
>>> +        } else {
>>> +            /*
>>> +             * The target backlight device might not be ready;
>>> +             * try again and disable backlight proxying if it
>>> +             * fails too many times.
>>> +             */
>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>> +                num_reprobe_attempts++;
>>> +                return -EPROBE_DEFER;
>>> +            }
>>> +
>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>> +                backlight_proxy_target, max_reprobe_attempts);
>>> +        }
>>> +    }
>>> +
>>>    ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>                               WMI_BRIGHTNESS_MODE_GET, &source);
>>>    if (ret)
>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>                          &wdev->dev, wdev,
>>>                          &nvidia_wmi_ec_backlight_ops,
>>>                          &props);
>>> -    return PTR_ERR_OR_ZERO(bdev);
>>> +
>>> +    if (IS_ERR(bdev))
>>> +        return PTR_ERR(bdev);
>>> +
>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>> +    if (!priv)
>>> +        return -ENOMEM;
>>> +
>>> +    priv->bl_dev = bdev;
>>> +
>>> +    dev_set_drvdata(&wdev->dev, priv);
>>> +
>>> +    if (target) {
>>> +        int level = scale_backlight_level(target, bdev);
>>> +
>>> +        if (backlight_device_set_brightness(bdev, level))
>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>> +                backlight_proxy_target);
>>> +        priv->proxy_target = target;
>>> +    }
>>> +
>>> +    if (restore_level_on_resume) {
>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>> +        register_pm_notifier(&priv->nb);
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>> +{
>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>> +
>>> +    if (priv->nb.notifier_call)
>>> +        unregister_pm_notifier(&priv->nb);
>>> }
>>>
>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>        .name = "nvidia-wmi-ec-backlight",
>>>    },
>>>    .probe = nvidia_wmi_ec_backlight_probe,
>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>    .id_table = nvidia_wmi_ec_backlight_id_table,
>>> };
>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
>>



* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
@ 2022-03-17 16:42             ` Hans de Goede
  0 siblings, 0 replies; 31+ messages in thread
From: Hans de Goede @ 2022-03-17 16:42 UTC (permalink / raw)
  To: Daniel Dadap
  Cc: dri-devel, platform-driver-x86, markgross, pobrn, Alexandru Dinu,
	Mario.Limonciello

Hi Daniel,

On 3/17/22 14:28, Daniel Dadap wrote:
> 
>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>
>> Hi,
>>
>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>> Some notebook systems with EC-driven backlight control appear to have a
>>> firmware bug which causes the system to use GPU-driven backlight control
>>> upon a fresh boot, but then switches to EC-driven backlight control
>>> after completing a suspend/resume cycle. All the while, the firmware
>>> reports that the backlight is under EC control, regardless of what is
>>> actually controlling the backlight brightness.
>>>
>>> This leads to the following behavior:
>>>
>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>  WMI-wrapped ACPI method erroneously reporting EC control.
>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>  cycle, due to the backlight control actually being GPU-driven.
>>> * GPU drivers also register their own backlight handlers: in the case
>>>  of the notebook system where this behavior has been observed, both
>>>  amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>  case observed so far) can successfully control the backlight through
>>>  its backlight driver's sysfs interface, but stops working after the
>>>  first suspend/resume cycle.
>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>  fresh boot, but begins to work after the first suspend/resume cycle.
>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>  is not able to control the backlight at any point while the system
>>>  is in operation. On similar hybrid systems with an EC-controlled
>>>  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>  does not register its backlight handler. It has not been determined
>>>  whether the non-functional handler registered by the NVIDIA driver
>>>  is due to another firmware bug, or a bug in the NVIDIA driver.
>>>
>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>> by the GPU drivers. This in turn leads to backlight control appearing
>>> to be non-functional until after completing a suspend/resume cycle.
>>> However, it is still possible to control the backlight through direct
>>> interaction with the working GPU driver's backlight sysfs interface.
>>>
>>> These systems also appear to have a second firmware bug which resets
>>> the EC's brightness level to 100% on resume, but leaves the state in
>>> the kernel at the pre-suspend level. This causes attempts to save
>>> and restore the backlight level across the suspend/resume cycle to
>>> fail, due to the level appearing not to change even though it did.
>>>
>>> In order to work around these issues, add a quirk table to detect
>>> systems that are known to show these behaviors. So far, there is
>>> only one known system that requires these workarounds, and both
>>> issues are present on that system, but the quirks are tracked
>>> separately to make it easier to add them to other systems which
>>> may exhibit one of the bugs, but not the other. The original systems
>>> that this driver was tested on during development do not exhibit
>>> either of these quirks.
>>>
>>> If a system with the "GPU driver has backlight control" quirk is
>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>> (when freshly booted) GPU backlight handler and relays any backlight
>>> brightness level change requests directed at the EC to also be applied
>>> to the GPU backlight interface. This leads to redundant updates
>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>> it does allow the EC backlight control to work when the system is
>>> freshly booted.
>>
>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>> plans to clean-up the whole x86 backlight mess soon and an important part
>> of that is to stop registering multiple backlight interfaces for the
>> same panel/screen.
>>
>> Whereas going with this workaround requires us to have 2
>> backlight interfaces active. Also this will very likely lead to
>> (subtly) different backlight behavior before and after the first
>> suspend/resume.
> 
> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?

ATM the kernel basically only supports a bunch of different methods
to control the backlight of 1 internal panel. The plan is to tie this
to the panel from a userspace pov by making the brightness +
max_brightness properties on the drm_connector object for the
internal-panel.

The in kernel tying of the backlight device to the internal panel
will be done hardcoded inside the drm driver(s) based on the
drivers already knowing which connector is the internal panel.

This all naively assumes there is only 1 internal panel, which
for the majority of cases is true. My plan for devices with
2 internal panels is to cross that bridge when we get there
(I expect those mostly in phone/tablet like devices for now
which will likely use devicetree where solving this is trivial).

I do realize we will eventually get some x86/acpi device with
2 internal panels. Hopefully we can just figure out what
the Windows drivers are doing there and parse e.g. the ACPI
info which Windows is using for this.

As part of the move to properties on the drm_connector object
the /sys/class/backlight interface will become deprecated,
but will be kept for backward compat and will eventually
be put behind a Kconfig option.

The kernel internal backlight_device stuff will be kept
since we need some internal representation anyways and
I don't see much value in reworking that, esp. since
we need to have /sys/class/backlight backward compat.

Note this is all based on discussions which I had with
mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
never gotten around to actually starting to work on this,
but this has resurfaced recently and I plan to actually
take a stab at implementing this plan sometime during 2022.

> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.

Right, see above the main idea is to make this
"the kernel's problem" and I expect us to fix this in
the kernel in a variety of different ways depending on
the actual hardware.

As for "troublesome for backlight drivers such as this one
which aren’t associated with any GPU.", the idea is that:

1. E.g. the i915 driver (which I have the most experience with)
knows which connector is the internal panel

2. The acpi_video_get_backlight_type() helper from
drivers/acpi/video_detect.c will get extended to make sure
that there is always only *1* /sys/class/backlight device.

To be specific, atm code supporting old vendor-specific backlight
fw interfaces, e.g. drivers/platform/x86/dell-laptop.c,
already does:

        if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
                return 0;

And drivers/acpi/acpi_video.c also already does:

        if (acpi_video_get_backlight_type() != acpi_backlight_video)
                return 0;

Currently looking at the 3 main x86 backlight interfaces: vendor,
generic-ACPI and native-drm-driver, only the native driver's
backlight registers unconditionally. The plan is to make those also
do a similar check (*) and to also add special backlight drivers like
nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
to this mechanism.

3. 1 + 2 means that the drm_driver can just tie the single
backlight_device which will be registered on the system to
the internal panel.

Again I'm completely ignoring dual-internal-panel devices here
for simplicity's sake.

Note this is getting a bit off-topic, but if you have insights
into this, or can already think of ways in which this is not going to
work :)  please let me know.


*) And adding that check + the presence of nvidia-wmi-ec-backlight
support will make the native drm-driver not register its
backlight_device at all, at which point the backlight-proxy workaround
from this patch breaks.


> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
> 
> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.

Good question, I must admit I stopped reading the patch after seeing
the proxy thing.

I see that you are using a pm_notifier for this. I wonder if you
can try (on your own system) to add a pm_ops struct and make
wmi_driver.driver.pm point to that and check if that gets called
by adding e.g. a pr_info (I don't see why it would not get called).

And assuming that works, using that would be a bit cleaner IMHO.
Although that does have resume-ordering implications. But I would
expect the EC to basically be always ready to get talked to at
the point in the resume cycle where normal (non early) resume
handlers are called.

To be clear the idea would be to always have the suspend handler
(so that the driver and pm_ops structs can be const) and to check
a quirk flag inside the resume handler. Or maybe even just always
read back the brightness from the hw and check if it has changed?
Does this need to be behind a quirk?
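
To make the "always read back and compare" variant concrete, here is a
small userspace model of what such a resume handler could do. struct
fake_ec, struct fake_backlight, ec_get_level(), ec_set_level(),
firmware_resume_quirk() and bl_resume() are hypothetical names for this
sketch, not functions from the driver:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the EC: holds the level the hardware uses. */
struct fake_ec {
	int level;
	int writes;	/* number of set calls, to show no redundant writes */
};

/* Hypothetical stand-in for the kernel-side backlight state. */
struct fake_backlight {
	struct fake_ec *ec;
	int cached_level;	/* level requested before suspend */
};

static int ec_get_level(const struct fake_ec *ec)
{
	return ec->level;
}

static void ec_set_level(struct fake_ec *ec, int level)
{
	ec->level = level;
	ec->writes++;
}

/* Model of the firmware bug: resume resets the EC level to the maximum. */
static void firmware_resume_quirk(struct fake_ec *ec, int max_level)
{
	ec->level = max_level;
}

/*
 * Resume handler without a quirk flag: read the level back and only
 * rewrite it when the hardware no longer matches the cached state.
 * Returns true when a restore was needed.
 */
static bool bl_resume(struct fake_backlight *bl)
{
	if (ec_get_level(bl->ec) == bl->cached_level)
		return false;

	ec_set_level(bl->ec, bl->cached_level);
	return true;
}
```

On a system without the bug bl_resume() does nothing beyond the one read,
which is the argument for not needing the quirk flag at all.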

>> Is there no other way to solve this issue? Maybe we need to poke
>> vgaswitcheroo to set the current GPU mode even though this is
>> already reported as active to get things to switch to the EC's
>> control right away ?
> 
> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.

Right, as you said the EC is always supposed to be in control, but
it is not. I would not be surprised if making the ACPI call to put
things in dynamic mode (even though they already are) fixes this,
assuming there is such an ACPI call...

>> I'm pretty certain that Windows is not doing this backlight proxying,
>> IMHO we need to figure out what causes the switch after suspend/resume
>> and then do that thing at boot.
> 
> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.

Great, thank you.

Regards,

Hans



>>> If a system with the "backlight level reset to full on resume" quirk
>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>> reset the backlight to the previous level upon resume.
>>>
>>> These workarounds are also plumbed through to kernel module parameters,
>>> to make it easier for users who suspect they may be affected by one or
>>> both of these bugs to test whether these workarounds are effective on
>>> their systems as well.
>>>
>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>> ---
>>> Note: the Tested-by: line above applies to the previous version of this
>>> patch; an explicit ACK from the tester is required for it to apply to
>>> the current version.
>>>
>>> v2:
>>> * Add readable sysfs files for module params, use linear interpolation
>>>   from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>   for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>   (Mario Limonciello <Mario.Limonciello@amd.com>)
>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>
>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> index 61e37194df70..95e1ddf780fc 100644
>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>> @@ -3,8 +3,12 @@
>>>  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>  */
>>>
>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>> +
>>> #include <linux/acpi.h>
>>> #include <linux/backlight.h>
>>> +#include <linux/dmi.h>
>>> +#include <linux/fixp-arith.h>
>>> #include <linux/mod_devicetable.h>
>>> #include <linux/module.h>
>>> #include <linux/types.h>
>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>    u32 ignored[3];
>>> };
>>>
>>> +/**
>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>> + * @bl_dev:       the associated backlight device
>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>> + * @nb:           notifier block for resume callback
>>> + */
>>> +struct nvidia_wmi_ec_backlight_priv {
>>> +    struct backlight_device *bl_dev;
>>> +    struct backlight_device *proxy_target;
>>> +    struct notifier_block nb;
>>> +};
>>> +
>>> +static char *backlight_proxy_target;
>>> +module_param(backlight_proxy_target, charp, 0444);
>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>> +
>>> +static int max_reprobe_attempts = 128;
>>> +module_param(max_reprobe_attempts, int, 0444);
>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>> +
>>> +static bool restore_level_on_resume;
>>> +module_param(restore_level_on_resume, bool, 0444);
>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>> +
>>> +/* Bit field values for quirks table */
>>> +
>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>> +
>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>> +
>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>> +
>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>> +
>>> +static int assign_quirks(const struct dmi_system_id *id)
>>> +{
>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>> +        restore_level_on_resume = true;
>>> +
>>> +    /* If the module parameter is set, override the quirks table */
>>> +    if (!backlight_proxy_target) {
>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>> +            backlight_proxy_target = "amdgpu_bl1";
>>> +    }
>>> +
>>> +    return 1;
>>> +}
>>> +
>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>> +    .callback = assign_quirks,                      \
>>> +    .matches = {                                    \
>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>> +    },                                              \
>>> +    .driver_data = (void *)(quirks)                 \
>>> +}
>>> +
>>> +static const struct dmi_system_id quirks_table[] = {
>>> +    QUIRK_ENTRY(
>>> +        /* This quirk is present as of firmware revision HACN31WW */
>>> +        "LENOVO", "Legion S7 15ACH6",
>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>> +    ),
>>> +    { }
>>> +};
>>> +
>>> /**
>>>  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>    return 0;
>>> }
>>>
>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>> +static int scale_backlight_level(const struct backlight_device *from,
>>> +                 const struct backlight_device *to)
>>> +{
>>> +    int from_max = from->props.max_brightness;
>>> +    int from_level = from->props.brightness;
>>> +    int to_max = to->props.max_brightness;
>>> +
>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>> +}
>>> +
>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>> {
>>>    struct wmi_device *wdev = bl_get_data(bd);
>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>> +
>>> +    if (proxy_target) {
>>> +        int level = scale_backlight_level(bd, proxy_target);
>>> +
>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>> +                backlight_proxy_target);
>>> +    }
>>>
>>>    return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>                                 WMI_BRIGHTNESS_MODE_SET,
>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>    .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>> };
>>>
>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>> +{
>>> +
>>> +    /*
>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>> +     * resuming from suspend, but the backlight device state still reflects
>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>> +     * state back up with the kernel's.
>>> +     */
>>> +    if (event == PM_POST_SUSPEND) {
>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>> +        int ret;
>>> +
>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>> +        ret = backlight_update_status(p->bl_dev);
>>> +
>>> +        if (ret)
>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>> +
>>> +        return NOTIFY_OK;
>>> +    }
>>> +
>>> +    return NOTIFY_DONE;
>>> +}
>>> +
>>> +static void putdev(void *data)
>>> +{
>>> +    struct device *dev = data;
>>> +
>>> +    put_device(dev);
>>> +}
>>> +
>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>> {
>>> +    struct backlight_device *bdev, *target = NULL;
>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>    struct backlight_properties props = {};
>>> -    struct backlight_device *bdev;
>>>    u32 source;
>>>    int ret;
>>>
>>> +    /*
>>> +     * Check quirks tables to see if this system needs any of the firmware
>>> +     * bug workarounds.
>>> +     */
>>> +    dmi_check_system(quirks_table);
>>> +
>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>> +        static int num_reprobe_attempts;
>>> +
>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>> +
>>> +        if (target) {
>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>> +                               &target->dev);
>>> +            if (ret)
>>> +                return ret;
>>> +        } else {
>>> +            /*
>>> +             * The target backlight device might not be ready;
>>> +             * try again and disable backlight proxying if it
>>> +             * fails too many times.
>>> +             */
>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>> +                num_reprobe_attempts++;
>>> +                return -EPROBE_DEFER;
>>> +            }
>>> +
>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>> +                backlight_proxy_target, max_reprobe_attempts);
>>> +        }
>>> +    }
>>> +
>>>    ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>                               WMI_BRIGHTNESS_MODE_GET, &source);
>>>    if (ret)
>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>                          &wdev->dev, wdev,
>>>                          &nvidia_wmi_ec_backlight_ops,
>>>                          &props);
>>> -    return PTR_ERR_OR_ZERO(bdev);
>>> +
>>> +    if (IS_ERR(bdev))
>>> +        return PTR_ERR(bdev);
>>> +
>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>> +    if (!priv)
>>> +        return -ENOMEM;
>>> +
>>> +    priv->bl_dev = bdev;
>>> +
>>> +    dev_set_drvdata(&wdev->dev, priv);
>>> +
>>> +    if (target) {
>>> +        int level = scale_backlight_level(target, bdev);
>>> +
>>> +        if (backlight_device_set_brightness(bdev, level))
>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>> +                backlight_proxy_target);
>>> +        priv->proxy_target = target;
>>> +    }
>>> +
>>> +    if (restore_level_on_resume) {
>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>> +        register_pm_notifier(&priv->nb);
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>> +{
>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>> +
>>> +    if (priv->nb.notifier_call)
>>> +        unregister_pm_notifier(&priv->nb);
>>> }
>>>
>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>        .name = "nvidia-wmi-ec-backlight",
>>>    },
>>>    .probe = nvidia_wmi_ec_backlight_probe,
>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>    .id_table = nvidia_wmi_ec_backlight_id_table,
>>> };
>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
>>
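
For reference, the scaling done by scale_backlight_level() in the patch
amounts to level * to_max / from_max in integer arithmetic. The userspace
helper below is a re-implementation for illustration, assuming the
interpolation passes through (0, 0) and (from_max, to_max) as in the
patch; scale_level() is a hypothetical name and this is not the kernel's
fixp_linear_interpolate() itself:

```c
/*
 * Map 'level' from the range [0, from_max] onto [0, to_max] by linear
 * interpolation through (0, 0) and (from_max, to_max). Integer division
 * truncates, so a round trip between two ranges may lose a step.
 */
static int scale_level(int level, int from_max, int to_max)
{
	if (from_max == 0)
		return to_max;	/* degenerate range: avoid dividing by zero */

	return (int)((long long)level * to_max / from_max);
}
```

For example, level 50 on a 0-100 device maps to 127 on a 0-255 device,
and mapping 127 back gives 49 rather than 50, so the two devices can
drift by a step when levels are imported in both directions.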



* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 16:42             ` Hans de Goede
  (?)
@ 2022-03-17 17:35             ` Alex Deucher
  2022-03-17 18:50                 ` Daniel Dadap
  -1 siblings, 1 reply; 31+ messages in thread
From: Alex Deucher @ 2022-03-17 17:35 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Daniel Dadap, dri-devel, platform-driver-x86, markgross, pobrn,
	Alexandru Dinu, Mario.Limonciello

On Thu, Mar 17, 2022 at 12:42 PM Hans de Goede <hdegoede@redhat.com> wrote:
>
> Hi Daniel,
>
> On 3/17/22 14:28, Daniel Dadap wrote:
> >
> >> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
> >>
> >> Hi,
> >>
> >>> On 3/16/22 21:33, Daniel Dadap wrote:
> >>> Some notebook systems with EC-driven backlight control appear to have a
> >>> firmware bug which causes the system to use GPU-driven backlight control
> >>> upon a fresh boot, but then switch to EC-driven backlight control
> >>> after completing a suspend/resume cycle. All the while, the firmware
> >>> reports that the backlight is under EC control, regardless of what is
> >>> actually controlling the backlight brightness.
> >>>
> >>> This leads to the following behavior:
> >>>
> >>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
> >>>  WMI-wrapped ACPI method erroneously reporting EC control.
> >>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
> >>>  cycle, due to the backlight control actually being GPU-driven.
> >>> * GPU drivers also register their own backlight handlers: in the case
> >>>  of the notebook system where this behavior has been observed, both
> >>>  amdgpu and the NVIDIA proprietary driver register backlight handlers.
> >>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
> >>>  case observed so far) can successfully control the backlight through
> >>>  its backlight driver's sysfs interface, but stops working after the
> >>>  first suspend/resume cycle.
> >>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
> >>>  fresh boot, but begins to work after the first suspend/resume cycle.
> >>> * The GPU which does not have backlight control (NVIDIA in this case)
> >>>  is not able to control the backlight at any point while the system
> >>>  is in operation. On similar hybrid systems with an EC-controlled
> >>>  backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
> >>>  does not register its backlight handler. It has not been determined
> >>>  whether the non-functional handler registered by the NVIDIA driver
> >>>  is due to another firmware bug, or a bug in the NVIDIA driver.
> >>>
> >>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> >>> device, it takes precedence over the BACKLIGHT_RAW devices registered
> >>> by the GPU drivers. This in turn leads to backlight control appearing
> >>> to be non-functional until after completing a suspend/resume cycle.
> >>> However, it is still possible to control the backlight through direct
> >>> interaction with the working GPU driver's backlight sysfs interface.
> >>>
> >>> These systems also appear to have a second firmware bug which resets
> >>> the EC's brightness level to 100% on resume, but leaves the state in
> >>> the kernel at the pre-suspend level. This causes attempts to save
> >>> and restore the backlight level across the suspend/resume cycle to
> >>> fail, due to the level appearing not to change even though it did.
> >>>
> >>> In order to work around these issues, add a quirk table to detect
> >>> systems that are known to show these behaviors. So far, there is
> >>> only one known system that requires these workarounds, and both
> >>> issues are present on that system, but the quirks are tracked
> >>> separately to make it easier to add them to other systems which
> >>> may exhibit one of the bugs, but not the other. The original systems
> >>> that this driver was tested on during development do not exhibit
> >>> either of these quirks.
> >>>
> >>> If a system with the "GPU driver has backlight control" quirk is
> >>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> >>> (when freshly booted) GPU backlight handler and relay any backlight
> >>> brightness level change requests directed at the EC to also be applied
> >>> to the GPU backlight interface. This leads to redundant updates
> >>> directed at the GPU backlight driver after a suspend/resume cycle, but
> >>> it does allow the EC backlight control to work when the system is
> >>> freshly booted.
> >>
> >> Ugh, I'm really not a fan of the backlight proxy plan here. I have
> >> plans to clean-up the whole x86 backlight mess soon and an important part
> >> of that is to stop registering multiple backlight interfaces for the
> >> same panel/screen.
> >>
> >> Whereas going with this workaround requires us to have 2
> >> backlight interfaces active. Also this will very likely lead to
> >> (subtly) different backlight behavior before and after the first
> >> suspend/resume.
> >
> > I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?
>
> ATM the kernel basically only supports a bunch of different methods
> to control the backlight of 1 internal panel. The plan is to tie this
> to the panel from a userspace pov by exposing brightness +
> max_brightness as properties on the drm_connector object for the
> internal panel.
>
> The in kernel tying of the backlight device to the internal panel
> will be done hardcoded inside the drm driver(s) based on the
> drivers already knowing which connector is the internal panel.
>
> This all naively assumes there is only 1 internal panel, which
> for the majority of cases is true. My plan for devices with
> 2 internal panels is to cross that bridge when we get there
> (I expect those mostly in phone/tablet like devices for now
> which will likely use devicetree where solving this is trivial).
>
> I do realize we will eventually get some x86/acpi device with
> 2 internal panels. Hopefully we can just figure out what
> the Windows drivers are doing there and parse e.g. the ACPI
> info which Windows is using for this.
>
> As part of the move to properties on the drm_connector object
> the /sys/class/backlight interface will become deprecated,
> but will be kept for backward compat and will eventually
> be put behind a Kconfig option.
>
> The kernel internal backlight_device stuff will be kept
> since we need some internal representation anyways and
> I don't see much value in reworking that, esp. since
> we need to have /sys/class/backlight backward compat.
>
> Note this is all based on discussions which I had with
> mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
> never gotten around to actually starting to work on this,
> but this has resurfaced recently and I plan to actually
> take a stab at implementing this plan sometime during 2022.
>
> > I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.
>
> Right, see above the main idea is to make this
> "the kernel's problem" and I expect us to fix this in
> the kernel in a variety of different ways depending on
> the actual hardware.
>
> As for "troublesome for backlight drivers such as this one
> which aren’t associated with any GPU.", the idea is that:
>
> 1. E.g the i915 driver (which I have the most experience with)
> knows which connector is the internal panel
>
> 2. The acpi_video_get_backlight_type() helper from
> drivers/acpi/video_detect.c will get extended to make sure
> that there is always only *1* /sys/class/backlight device.
>
> To be specific atm code supporting old vendor specific backlight
> fw interfaces, e.g. drivers/platform/x86/dell-laptop.c,
> already does:
>
>        if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
>                 return 0;
>
> And drivers/acpi/acpi_video.c also already does:
>
>        if (acpi_video_get_backlight_type() != acpi_backlight_video)
>                 return 0;
>
> Currently looking at the 3 main x86 backlight interfaces: vendor,
> generic-ACPI and native-drm-driver, only the native driver's
> backlight registers unconditionally. The plan is to make those also
> do a similar check (*) and to also add special backlight drivers like
> nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
> to this mechanism.
>
> 3. 1 + 2 means that the drm_driver can just tie the single
> backlight_device which will be registered on the system to
> the internal panel.
>
> Again I'm completely ignoring dual-internal-panel devices here
> for simplicity's sake.
>
> Note this is getting a bit off-topic, but if you have insights
> in this, or already can think of ways how this is not going to
> work :)  please let me know.
>
>
> *) And adding that check + the presence of nvidia-wmi-ec-backlight
> support will make the native drm-driver not register its
> backlight_device at all, at which point the backlight-proxy workaround
> from this patch breaks.
>
>
> > This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
> >
> > Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
>

Sorry for jumping in here, but I can't seem to find the original
thread with this comment.  amdgpu_atombios_encoder_init_backlight() is
not applicable to these systems.  That is the old pre-DC code path.
You want amdgpu_dm_register_backlight_device() for modern hardware.

Alex

> Good question, I must admit I stopped reading the patch after seeing
> the proxy thing.
>
> I see that you are using a pm_notifier for this. I wonder if you
> can try (on your own system) to add a pm_ops struct and make
> wmi_driver.driver.pm point to that and check if that gets called
> by adding e.g. a pr_info (I don't see why it would not get called).
>
> And assuming that works, using that would be a bit cleaner IMHO.
> Although that does have resume-ordering implications. But I would
> expect the EC to basically be always ready to get talked to at
> the point in the resume cycle where normal (non early) resume
> handlers are called.
>
> To be clear the idea would be to always have the suspend handler
> (so that the driver and pm_ops structs can be const) and to check
> a quirk flag inside the resume handler. Or maybe even just always
> read back the brightness from the hw and check if it has changed?
> Does this need to be behind a quirk?
>
> >> Is there no other way to solve this issue? Maybe we need to poke
> >> vgaswitcheroo to set the current GPU mode even though this is
> >> already reported as active to get things to switch to the EC's
> >> control right away ?
> >
> > There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.
>
> Right, as you said the EC is always supposed to be in control, but
> it is not. I would not be surprised if making the ACPI call to put
> things in dynamic mode (even though they already are) fixes this,
> assuming there is such an ACPI call...
>
> >> I'm pretty certain that Windows is not doing this backlight proxying,
> >> IMHO we need to figure out what causes the switch after suspend/resume
> >> and then do that thing at boot.
> >
> > I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.
>
> Great, thank you.
>
> Regards,
>
> Hans
>
>
>
> >>> If a system with the "backlight level reset to full on resume" quirk
> >>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> >>> reset the backlight to the previous level upon resume.
> >>>
> >>> These workarounds are also plumbed through to kernel module parameters,
> >>> to make it easier for users who suspect they may be affected by one or
> >>> both of these bugs to test whether these workarounds are effective on
> >>> their systems as well.
> >>>
> >>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> >>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> >>> ---
> >>> Note: the Tested-by: line above applies to the previous version of this
> >>> patch; an explicit ACK from the tester is required for it to apply to
> >>> the current version.
> >>>
> >>> v2:
> >>> * Add readable sysfs files for module params, use linear interpolation
> >>>   from fixp-arith.h, fix return value of notifier callback, use devm_*()
> >>>   for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
> >>> * Add comment to denote known firmware versions that exhibit the bugs.
> >>>   (Mario Limonciello <Mario.Limonciello@amd.com>)
> >>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
> >>>
> >>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
> >>> 1 file changed, 194 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >>> index 61e37194df70..95e1ddf780fc 100644
> >>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> >>> @@ -3,8 +3,12 @@
> >>>  * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
> >>>  */
> >>>
> >>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> >>> +
> >>> #include <linux/acpi.h>
> >>> #include <linux/backlight.h>
> >>> +#include <linux/dmi.h>
> >>> +#include <linux/fixp-arith.h>
> >>> #include <linux/mod_devicetable.h>
> >>> #include <linux/module.h>
> >>> #include <linux/types.h>
> >>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
> >>>    u32 ignored[3];
> >>> };
> >>>
> >>> +/**
> >>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> >>> + * @bl_dev:       the associated backlight device
> >>> + * @proxy_target: backlight device which receives relayed brightness changes
> >>> + * @nb:           notifier block for resume callback
> >>> + */
> >>> +struct nvidia_wmi_ec_backlight_priv {
> >>> +    struct backlight_device *bl_dev;
> >>> +    struct backlight_device *proxy_target;
> >>> +    struct notifier_block nb;
> >>> +};
> >>> +
> >>> +static char *backlight_proxy_target;
> >>> +module_param(backlight_proxy_target, charp, 0444);
> >>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> >>> +
> >>> +static int max_reprobe_attempts = 128;
> >>> +module_param(max_reprobe_attempts, int, 0444);
> >>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> >>> +
> >>> +static bool restore_level_on_resume;
> >>> +module_param(restore_level_on_resume, bool, 0444);
> >>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> >>> +
> >>> +/* Bit field values for quirks table */
> >>> +
> >>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> >>> +
> >>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> >>> +
> >>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> >>> +
> >>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> >>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
> >>> +
> >>> +static int assign_quirks(const struct dmi_system_id *id)
> >>> +{
> >>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> >>> +        restore_level_on_resume = true;
> >>> +
> >>> +    /* If the module parameter is set, override the quirks table */
> >>> +    if (!backlight_proxy_target) {
> >>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> >>> +            backlight_proxy_target = "amdgpu_bl1";
> >>> +    }
> >>> +
> >>> +    return 1;
> >>> +}
> >>> +
> >>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> >>> +    .callback = assign_quirks,                      \
> >>> +    .matches = {                                    \
> >>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> >>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> >>> +    },                                              \
> >>> +    .driver_data = (void *)(quirks)                 \
> >>> +}
> >>> +
> >>> +static const struct dmi_system_id quirks_table[] = {
> >>> +    QUIRK_ENTRY(
> >>> +        /* This quirk is present as of firmware revision HACN31WW */
> >>> +        "LENOVO", "Legion S7 15ACH6",
> >>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> >>> +    ),
> >>> +    { }
> >>> +};
> >>> +
> >>> /**
> >>>  * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
> >>>  * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> >>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
> >>>    return 0;
> >>> }
> >>>
> >>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
> >>> +static int scale_backlight_level(const struct backlight_device *from,
> >>> +                 const struct backlight_device *to)
> >>> +{
> >>> +    int from_max = from->props.max_brightness;
> >>> +    int from_level = from->props.brightness;
> >>> +    int to_max = to->props.max_brightness;
> >>> +
> >>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> >>> +}
> >>> +
> >>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
> >>> {
> >>>    struct wmi_device *wdev = bl_get_data(bd);
> >>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> >>> +    struct backlight_device *proxy_target = priv->proxy_target;
> >>> +
> >>> +    if (proxy_target) {
> >>> +        int level = scale_backlight_level(bd, proxy_target);
> >>> +
> >>> +        if (backlight_device_set_brightness(proxy_target, level))
> >>> +            pr_warn("Failed to relay backlight update to \"%s\"",
> >>> +                backlight_proxy_target);
> >>> +    }
> >>>
> >>>    return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
> >>>                                 WMI_BRIGHTNESS_MODE_SET,
> >>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
> >>>    .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
> >>> };
> >>>
> >>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> >>> +{
> >>> +
> >>> +    /*
> >>> +     * On some systems, the EC backlight level gets reset to 100% when
> >>> +     * resuming from suspend, but the backlight device state still reflects
> >>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
> >>> +     * state back up with the kernel's.
> >>> +     */
> >>> +    if (event == PM_POST_SUSPEND) {
> >>> +        struct nvidia_wmi_ec_backlight_priv *p;
> >>> +        int ret;
> >>> +
> >>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> >>> +        ret = backlight_update_status(p->bl_dev);
> >>> +
> >>> +        if (ret)
> >>> +            pr_warn("failed to refresh backlight level: %d", ret);
> >>> +
> >>> +        return NOTIFY_OK;
> >>> +    }
> >>> +
> >>> +    return NOTIFY_DONE;
> >>> +}
> >>> +
> >>> +static void putdev(void *data)
> >>> +{
> >>> +    struct device *dev = data;
> >>> +
> >>> +    put_device(dev);
> >>> +}
> >>> +
> >>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
> >>> {
> >>> +    struct backlight_device *bdev, *target = NULL;
> >>> +    struct nvidia_wmi_ec_backlight_priv *priv;
> >>>    struct backlight_properties props = {};
> >>> -    struct backlight_device *bdev;
> >>>    u32 source;
> >>>    int ret;
> >>>
> >>> +    /*
> >>> +     * Check quirks tables to see if this system needs any of the firmware
> >>> +     * bug workarounds.
> >>> +     */
> >>> +    dmi_check_system(quirks_table);
> >>> +
> >>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
> >>> +        static int num_reprobe_attempts;
> >>> +
> >>> +        target = backlight_device_get_by_name(backlight_proxy_target);
> >>> +
> >>> +        if (target) {
> >>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
> >>> +                               &target->dev);
> >>> +            if (ret)
> >>> +                return ret;
> >>> +        } else {
> >>> +            /*
> >>> +             * The target backlight device might not be ready;
> >>> +             * try again and disable backlight proxying if it
> >>> +             * fails too many times.
> >>> +             */
> >>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
> >>> +                num_reprobe_attempts++;
> >>> +                return -EPROBE_DEFER;
> >>> +            }
> >>> +
> >>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> >>> +                backlight_proxy_target, max_reprobe_attempts);
> >>> +        }
> >>> +    }
> >>> +
> >>>    ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
> >>>                               WMI_BRIGHTNESS_MODE_GET, &source);
> >>>    if (ret)
> >>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
> >>>                          &wdev->dev, wdev,
> >>>                          &nvidia_wmi_ec_backlight_ops,
> >>>                          &props);
> >>> -    return PTR_ERR_OR_ZERO(bdev);
> >>> +
> >>> +    if (IS_ERR(bdev))
> >>> +        return PTR_ERR(bdev);
> >>> +
> >>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> >>> +    if (!priv)
> >>> +        return -ENOMEM;
> >>> +
> >>> +    priv->bl_dev = bdev;
> >>> +
> >>> +    dev_set_drvdata(&wdev->dev, priv);
> >>> +
> >>> +    if (target) {
> >>> +        int level = scale_backlight_level(target, bdev);
> >>> +
> >>> +        if (backlight_device_set_brightness(bdev, level))
> >>> +            pr_warn("Unable to import initial brightness level from %s.",
> >>> +                backlight_proxy_target);
> >>> +        priv->proxy_target = target;
> >>> +    }
> >>> +
> >>> +    if (restore_level_on_resume) {
> >>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> >>> +        register_pm_notifier(&priv->nb);
> >>> +    }
> >>> +
> >>> +    return 0;
> >>> +}
> >>> +
> >>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> >>> +{
> >>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> >>> +
> >>> +    if (priv->nb.notifier_call)
> >>> +        unregister_pm_notifier(&priv->nb);
> >>> }
> >>>
> >>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> >>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
> >>>        .name = "nvidia-wmi-ec-backlight",
> >>>    },
> >>>    .probe = nvidia_wmi_ec_backlight_probe,
> >>> +    .remove = nvidia_wmi_ec_backlight_remove,
> >>>    .id_table = nvidia_wmi_ec_backlight_id_table,
> >>> };
> >>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> >>
>


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 16:42             ` Hans de Goede
@ 2022-03-17 18:36               ` Daniel Dadap
  -1 siblings, 0 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-17 18:36 UTC (permalink / raw)
  To: Hans de Goede
  Cc: platform-driver-x86, pobrn, markgross, Mario.Limonciello,
	Alexandru Dinu, dri-devel, Daniel Vetter


On 3/17/22 11:42, Hans de Goede wrote:
> Hi Daniel,
>
> On 3/17/22 14:28, Daniel Dadap wrote:
>>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>>> Some notebook systems with EC-driven backlight control appear to have a
>>>> firmware bug which causes the system to use GPU-driven backlight control
>>>> upon a fresh boot, but then switches to EC-driven backlight control
>>>> after completing a suspend/resume cycle. All the while, the firmware
>>>> reports that the backlight is under EC control, regardless of what is
>>>> actually controlling the backlight brightness.
>>>>
>>>> This leads to the following behavior:
>>>>
>>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>>   WMI-wrapped ACPI method erroneously reporting EC control.
>>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>>   cycle, due to the backlight control actually being GPU-driven.
>>>> * GPU drivers also register their own backlight handlers: in the case
>>>>   of the notebook system where this behavior has been observed, both
>>>>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>>   case observed so far) can successfully control the backlight through
>>>>   its backlight driver's sysfs interface, but stops working after the
>>>>   first suspend/resume cycle.
>>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>>   fresh boot, but begins to work after the first suspend/resume cycle.
>>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>>   is not able to control the backlight at any point while the system
>>>>   is in operation. On similar hybrid systems with an EC-controlled
>>>>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>>   does not register its backlight handler. It has not been determined
>>>>   whether the non-functional handler registered by the NVIDIA driver
>>>>   is due to another firmware bug, or a bug in the NVIDIA driver.
>>>>
>>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>>> by the GPU drivers. This in turn leads to backlight control appearing
>>>> to be non-functional until after completing a suspend/resume cycle.
>>>> However, it is still possible to control the backlight through direct
>>>> interaction with the working GPU driver's backlight sysfs interface.
>>>>
>>>> These systems also appear to have a second firmware bug which resets
>>>> the EC's brightness level to 100% on resume, but leaves the state in
>>>> the kernel at the pre-suspend level. This causes attempts to save
>>>> and restore the backlight level across the suspend/resume cycle to
>>>> fail, due to the level appearing not to change even though it did.
>>>>
>>>> In order to work around these issues, add a quirk table to detect
>>>> systems that are known to show these behaviors. So far, there is
>>>> only one known system that requires these workarounds, and both
>>>> issues are present on that system, but the quirks are tracked
>>>> separately to make it easier to add them to other systems which
>>>> may exhibit one of the bugs, but not the other. The original systems
>>>> that this driver was tested on during development do not exhibit
>>>> either of these quirks.
>>>>
>>>> If a system with the "GPU driver has backlight control" quirk is
>>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>>> (when freshly booted) GPU backlight handler and relay any backlight
>>>> brightness level change requests directed at the EC to also be applied
>>>> to the GPU backlight interface. This leads to redundant updates
>>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>>> it does allow the EC backlight control to work when the system is
>>>> freshly booted.
>>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>>> plans to clean-up the whole x86 backlight mess soon and an important part
>>> of that is to stop registering multiple backlight interfaces for the
>>> same panel/screen.
>>>
>>> Whereas going with this workaround requires us to have 2 backlight
>>> interfaces active. Also this will very likely lead to
>>> (subtly) different backlight behavior before and after the first
>>> suspend/resume.
>> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?
> ATM the kernel basically only supports a bunch of different methods
> to control the backlight of 1 internal panel. The plan is to tie this
> to the panel from a userspace pov by making the brightness +
> max_brightness properties on the drm_connector object for the
> internal-panel.
>
> The in kernel tying of the backlight device to the internal panel
> will be done hardcoded inside the drm driver(s) based on the
> drivers already knowing which connector is the internal panel.


Okay. At the moment the other problem I am thinking about also makes a 
one-internal-panel assumption, and it's true for the particular hardware 
that I'm working with, but I didn't like that assumption. If there isn't 
something existing that can be used to link connectors from multiple 
GPUs together to indicate they actually (potentially) drive the same 
display panel, then I suppose I'll continue assuming that. But I really 
do think the >1 display case needs to be more than just an afterthought, 
even if the solution, for now, is to just have all the drivers agree 
that a single internal panel is e.g. index 0 of some shared table that 
the drivers can all consult so that everybody knows which panel is which.

I am also concerned about muxed designs that don't have an EC-controlled 
backlight. If there's only allowed to be one backlight device per panel, 
then that means vga_switcheroo would have to do something to disable the 
outgoing GPU's backlight control and enable the incoming one's. And then 
all the userspace software that controls the sysfs backlight interfaces 
would need to make sure to check to see what backlight devices are 
exposed every time a brightness change request is made, and not e.g. 
just once at init and then assume that the backlight devices won't 
change over the lifetime of whatever backlight manager software we're 
talking about (e.g. gnome-settings-daemon). I suppose if what the kernel 
exposes is an abstraction so that userspace only sees one backlight 
interface per panel in sysfs at any given time, which might actually be 
connected to one of several different drivers in the backlight 
subsystem, that would make it a little better, but I still think there 
would be potential for races between a mux switch and a brightness 
change event.
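
To make the userspace side of that concern concrete, here is a minimal 
sketch in plain C (not kernel code, and not taken from any existing 
backlight manager). It assumes a manager that re-resolves the preferred 
device from a fresh snapshot on every brightness request, using the 
usual firmware > platform > raw preference. The enum, the struct, and 
pick_backlight() are hypothetical names, and "nvidia_0" in the comment 
is only an illustrative device name.

```c
#include <assert.h>
#include <stddef.h>

/* Ordered so that a larger value means "more preferred". */
enum bl_type { BL_RAW = 0, BL_PLATFORM = 1, BL_FIRMWARE = 2 };

/* Hypothetical snapshot entry for one /sys/class/backlight device. */
struct bl_entry {
	const char *name;	/* e.g. "amdgpu_bl1" or "nvidia_0" */
	enum bl_type type;
};

/*
 * Pick the most preferred device from the current snapshot.
 * Returns NULL if the snapshot is empty. On a tie, the first
 * entry of the winning type is kept.
 */
static const struct bl_entry *pick_backlight(const struct bl_entry *devs,
					     size_t n)
{
	const struct bl_entry *best = NULL;
	size_t i;

	assert(devs || n == 0);

	for (i = 0; i < n; i++) {
		if (!best || devs[i].type > best->type)
			best = &devs[i];
	}

	return best;
}
```

Calling pick_backlight() on a fresh snapshot for each request avoids 
caching a stale device, but it does not close the window between taking 
the snapshot and writing the level, which is the race I'm worried about.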

> This all naively assumes there is only 1 internal panel, which
> for the majority of cases is true. My plan for devices with
> 2 internal panels is to cross that bridge when we get there
> (I expect those mostly in phone/tablet like devices for now
> which will likely use devicetree where solving this is trivial).
>
> I do realize we will eventually get some x86/acpi device with
> 2 internal panels. Hopefully we can just figure out what
> the Windows drivers are doing there and parse e.g. the ACPI
> info which Windows is using for this.


I'm not aware of any more modern examples, but there already have been 
such systems. See e.g. Lenovo ThinkPad W701ds. IIRC that system was 
discrete GPU only, so there wouldn't be any concern about coordinating 
panel IDs between different GPU drivers, but there is a very real 
possibility that a vendor may want to bring back a similar design with 
hybrid GPUs. The asymmetry of that system always bothered me a bit: I 
imagine that if the NVIDIA GPUs of that era supported more than two 
heads per GPU, we may have seen a three-screen notebook.

It's very possible that some of the modern exotic e.g. folding designs 
show up as dual displays, but I don't have any first-hand experience 
with those to know whether that is the case. I definitely remember the 
W701ds exposed the two displays as separate entities. (And the one that 
popped off to the side was rotated, which was fun.)


> As part of the move to properties on the drm_connector object
> the /sys/class/backlight interface will become deprecated,
> but will be kept for backward compat and will eventually
> be put behind a Kconfig option.


If you have a high-level design for this written down somewhere, do you 
mind sharing it? I want to get an idea of what types of changes the 
proprietary nvidia-drm driver might need to participate in this.


> The kernel internal backlight_device stuff will be kept
> since we need some internal representation anyways and
> I don't see much value in reworking that, esp. since
> we need to have /sys/class/backlight backward compat.
>
> Note this is all based on discussions which I had with
> mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
> never gotten around to actually start working on this,
> but this has resurfaced recently and I plan to actually
> take a stab at implementing this plan sometime during 2022.
>
>> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.
> Right, see above the main idea is to make this
> "the kernel's problem" and I expect us to fix this in
> the kernel in a variety of different ways depending on
> the actual hardware.
>
> As for "troublesome for backlight drivers such as this one
> which aren’t associated with any GPU.", the idea is that:
>
> 1. E.g the i915 driver (which I have the most experience with)
> knows which connector is the internal panel


Sure, the NVIDIA proprietary driver knows this as well, and what I've 
seen from the other DRM drivers also suggests that this is an easy 
determination to make.


> 2. The acpi_video_get_backlight_type() helper from
> drivers/acpi/video_detect.c will get extended to make sure
> that there is always only *1* /sys/class/backlight device.
>
> To be specific atm code supporting old vendor specific backlight
> fw interfaces, e.g. drivers/platform/x86/dell-laptop.c:
> already does:
>
>         if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
>                  return 0;
>
> And drivers/acpi/acpi_video.c also already does:
>
>         if (acpi_video_get_backlight_type() != acpi_backlight_video)
>                  return 0;
>
> Currently looking at the 3 main x86 backlight interfaces: vendor,
> generic-ACPI and native-drm-driver, only the native driver's
> backlight registers unconditionally. The plan is to make those also
> do a similar check (*) and to also add special backlight drivers like
> nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
> to this mechanism.


nvidia-wmi-ec-backlight does this check during probe:

>         ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>                                    WMI_BRIGHTNESS_MODE_GET, &source);
>         if (ret)
>                 return ret;
>
>         /*
>          * This driver is only to be used when brightness control is handled
>          * by the EC; otherwise, the GPU driver(s) should control brightness.
>          */
>         if (source != WMI_BRIGHTNESS_SOURCE_EC)
>                 return -ENODEV;


And part of the problem with this bug is that what the firmware is 
reporting here is a lie, at least until the first suspend/resume cycle.

Also, unlike the DRM drivers, the NVIDIA proprietary driver does not 
register its backlight handler unconditionally: one of the strange 
things about this system is that the NVIDIA proprietary driver *does* 
register a backlight handler, even though it's definitely not supposed 
to in this case. I did notice that the DRM drivers don't currently seem 
to have any checks to see whether they should register, and indeed on 
the EC backlight systems I have tested, with both amdgpu and i915, there 
is a non-functional sysfs backlight interface exposed by the iGPU. One 
of the other strange things about the system is the fact that the amdgpu 
backlight driver works at all.

> 3. 1 + 2 means that the drm_driver can just tie the single
> backlight_device which will be registered on the system to
> the internal panel.

This sounds kind of ugly to me. If the backlight control is 
GPU-agnostic, as is the case here, then associating the backlight_device 
with a drm_driver doesn't seem right. It sounds like the plan is to move 
from the current sysfs interface to one that's exposed by the DRM 
subsystem. If that's the case, I think there should be a way to do so 
without tying it to a specific GPU driver. e.g., on a muxed system, one 
or the other GPU might be scanning out at any given time (which is why 
many of these systems have non-GPU backlight control), so if the 
interface is GPU-specific, that would be at the very least confusing.


> Again I'm completely ignoring dual-internal-panel devices here
> for simplicity's sake.
>
> Note this is getting a bit off-topic, but if you have insights
> in this, or already can think of ways how this is not going to
> work :)  please let me know.


My concerns listed above are everything I can think of off the top of my 
head. I'm sure if I think about it more I'll come up with other 
concerns. This did get a bit off-topic, but I'm actually very glad 
you've told me all this.


>
> *) And adding that check + the presence of nvidia-wmi-ec-backlight
> support will make the native drm-driver not register its
> backlight_device at all at which point the backlight-proxy workaround
> from this patch breaks.
>
>
>> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
>>
>> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
> Good question, I must admit I stopped reading the patch after seeing
> the proxy thing.
>
> I see that you are using a pm_notifier for this. I wonder if you
> can try (on your own system) to add a pm_ops struct and make
> wmi_driver.driver.pm point to that and check if that gets called
> by adding e.g. a pr_info (I don't see why it would not get called).
>
> And assuming that works, using that would be a bit cleaner IMHO.
> Although that does have resume-ordering implications. But I would
> expect the EC to basically be always ready to get talked to at
> the point in the resume cycle where normal (non early) resume
> handlers are called.
>
> To be clear the idea would be to always have the suspend handler
> (so that the driver and pm_ops structs can be const) and to check
> a quirk flag inside the resume handler. Or maybe even just always
> read back the brightness from the hw and check if it has changed?
> Does this need to be behind a quirk ?


Okay, thanks. I missed the wmi_driver.driver.pm when I was looking to 
wire up something to refresh the backlight level on resume; that does 
sound cleaner than registering a notifier. Alex did report that the 
backlight was briefly at maximum brightness following resume, before the 
notifier kicked in and reset it to the correct level, so hopefully 
hooking it up through pm_ops will be able to catch it early enough to 
avoid that. And making the refresh unconditional rather than hidden 
behind a quirk sounds reasonable too; I expect it to be harmless on 
systems that don't need it.
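
For reference, the read-back variant could be modeled roughly as below. 
This is a userspace simulation in plain C, not driver code: struct 
fake_ec, struct fake_backlight, simulate_resume() and resume_refresh() 
are hypothetical stand-ins for the EC, the backlight_device state, the 
firmware bug, and a resume callback. It also assumes the EC level can be 
read back reliably after resume, which has not been verified on the 
affected system.

```c
#include <assert.h>

/* Hypothetical stand-in for the EC: the level the hardware applies. */
struct fake_ec {
	int level;
	int writes;	/* number of SET calls issued so far */
};

/* Hypothetical stand-in for the kernel's cached backlight state. */
struct fake_backlight {
	struct fake_ec *ec;
	int brightness;	/* level the kernel believes is current */
};

static int ec_get_level(const struct fake_ec *ec)
{
	return ec->level;
}

static void ec_set_level(struct fake_ec *ec, int level)
{
	ec->level = level;
	ec->writes++;
}

/* Models the firmware bug: the EC jumps to maximum on resume. */
static void simulate_resume(struct fake_ec *ec, int max_level)
{
	ec->level = max_level;
}

/*
 * Resume callback model: read the level back from the EC and rewrite
 * the cached level only if the two disagree.
 * Returns 1 if a write was issued, 0 if the EC was already in sync.
 */
static int resume_refresh(struct fake_backlight *bl)
{
	assert(bl && bl->ec);

	if (ec_get_level(bl->ec) == bl->brightness)
		return 0;

	ec_set_level(bl->ec, bl->brightness);
	return 1;
}
```

On a system without the bug the read-back matches and no write is 
issued, which is why I'd expect this to be harmless without a quirk.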


>>> Is there no other way to solve this issue? Maybe we need to poke
>>> vgaswitcheroo to set the current GPU mode even though this is
>>> already reported as active to get things to switch to the ECs
>>> control right away ?
>> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.
> Right, as you said the EC is always supposed to be in control, but
> it is not. I would not be surprised if making the ACPI call to put
> things in dynamic mode (even though they already are) fixes this,
> assuming there is such an ACPI call...


Yes, there is one, although changes to the mode aren't supposed to take 
effect until reboot. However, this system already seems to be 
misbehaving, so it's certainly possible that poking at that call will do 
something.


>>> I'm pretty certain that Windows is not doing this backlight proxying,
>>> IMHO we need to figure out what causes the switch after suspend/resume
>>> and then do that thing at boot.
>> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.
> Great, thank you.
>
> Regards,
>
> Hans
>
>
>
>>>> If a system with the "backlight level reset to full on resume" quirk
>>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>>> reset the backlight to the previous level upon resume.
>>>>
>>>> These workarounds are also plumbed through to kernel module parameters,
>>>> to make it easier for users who suspect they may be affected by one or
>>>> both of these bugs to test whether these workarounds are effective on
>>>> their systems as well.
>>>>
>>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>>> ---
>>>> Note: the Tested-by: line above applies to the previous version of this
>>>> patch; an explicit ACK from the tester is required for it to apply to
>>>> the current version.
>>>>
>>>> v2:
>>>> * Add readable sysfs files for module params, use linear interpolation
>>>>    from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>>    for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>>    (Mario Limonciello <Mario.Limonciello@amd.com>)
>>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>>
>>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> index 61e37194df70..95e1ddf780fc 100644
>>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> @@ -3,8 +3,12 @@
>>>>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>>   */
>>>>
>>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>>> +
>>>> #include <linux/acpi.h>
>>>> #include <linux/backlight.h>
>>>> +#include <linux/dmi.h>
>>>> +#include <linux/fixp-arith.h>
>>>> #include <linux/mod_devicetable.h>
>>>> #include <linux/module.h>
>>>> #include <linux/types.h>
>>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>>     u32 ignored[3];
>>>> };
>>>>
>>>> +/**
>>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>>> + * @bl_dev:       the associated backlight device
>>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>>> + * @nb:           notifier block for resume callback
>>>> + */
>>>> +struct nvidia_wmi_ec_backlight_priv {
>>>> +    struct backlight_device *bl_dev;
>>>> +    struct backlight_device *proxy_target;
>>>> +    struct notifier_block nb;
>>>> +};
>>>> +
>>>> +static char *backlight_proxy_target;
>>>> +module_param(backlight_proxy_target, charp, 0444);
>>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>>> +
>>>> +static int max_reprobe_attempts = 128;
>>>> +module_param(max_reprobe_attempts, int, 0444);
>>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>>> +
>>>> +static bool restore_level_on_resume;
>>>> +module_param(restore_level_on_resume, bool, 0444);
>>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>>> +
>>>> +/* Bit field values for quirks table */
>>>> +
>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>>> +
>>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>>> +
>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>>> +
>>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>>> +
>>>> +static int assign_quirks(const struct dmi_system_id *id)
>>>> +{
>>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>>> +        restore_level_on_resume = 1;
>>>> +
>>>> +    /* If the module parameter is set, override the quirks table */
>>>> +    if (!backlight_proxy_target) {
>>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>>> +            backlight_proxy_target = "amdgpu_bl1";
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>>> +    .callback = assign_quirks,                      \
>>>> +    .matches = {                                    \
>>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>>> +    },                                              \
>>>> +    .driver_data = (void *)(quirks)                 \
>>>> +}
>>>> +
>>>> +static const struct dmi_system_id quirks_table[] = {
>>>> +    QUIRK_ENTRY(
>>>> +        /* This quirk is present as of firmware revision HACN31WW */
>>>> +        "LENOVO", "Legion S7 15ACH6",
>>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>>> +    ),
>>>> +    { }
>>>> +};
>>>> +
>>>> /**
>>>>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>>     return 0;
>>>> }
>>>>
>>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>>> +static int scale_backlight_level(const struct backlight_device *from,
>>>> +                 const struct backlight_device *to)
>>>> +{
>>>> +    int from_max = from->props.max_brightness;
>>>> +    int from_level = from->props.brightness;
>>>> +    int to_max = to->props.max_brightness;
>>>> +
>>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>>> +}
>>>> +
>>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>>> {
>>>>     struct wmi_device *wdev = bl_get_data(bd);
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>>> +
>>>> +    if (proxy_target) {
>>>> +        int level = scale_backlight_level(bd, proxy_target);
>>>> +
>>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>>> +                backlight_proxy_target);
>>>> +    }
>>>>
>>>>     return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>>                                  WMI_BRIGHTNESS_MODE_SET,
>>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>>     .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>>> };
>>>>
>>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>>> +{
>>>> +
>>>> +    /*
>>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>>> +     * resuming from suspend, but the backlight device state still reflects
>>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>>> +     * state back up with the kernel's.
>>>> +     */
>>>> +    if (event == PM_POST_SUSPEND) {
>>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>>> +        int ret;
>>>> +
>>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>>> +        ret = backlight_update_status(p->bl_dev);
>>>> +
>>>> +        if (ret)
>>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>>> +
>>>> +        return NOTIFY_OK;
>>>> +    }
>>>> +
>>>> +    return NOTIFY_DONE;
>>>> +}
>>>> +
>>>> +static void putdev(void *data)
>>>> +{
>>>> +    struct device *dev = data;
>>>> +
>>>> +    put_device(dev);
>>>> +}
>>>> +
>>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>>> {
>>>> +    struct backlight_device *bdev, *target = NULL;
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>>     struct backlight_properties props = {};
>>>> -    struct backlight_device *bdev;
>>>>     u32 source;
>>>>     int ret;
>>>>
>>>> +    /*
>>>> +     * Check quirks tables to see if this system needs any of the firmware
>>>> +     * bug workarounds.
>>>> +     */
>>>> +    dmi_check_system(quirks_table);
>>>> +
>>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>>> +        static int num_reprobe_attempts;
>>>> +
>>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>>> +
>>>> +        if (target) {
>>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>>> +                               &target->dev);
>>>> +            if (ret)
>>>> +                return ret;
>>>> +        } else {
>>>> +            /*
>>>> +             * The target backlight device might not be ready;
>>>> +             * try again and disable backlight proxying if it
>>>> +             * fails too many times.
>>>> +             */
>>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>>> +                num_reprobe_attempts++;
>>>> +                return -EPROBE_DEFER;
>>>> +            }
>>>> +
>>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>>> +                backlight_proxy_target, max_reprobe_attempts);
>>>> +        }
>>>> +    }
>>>> +
>>>>     ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>>                                WMI_BRIGHTNESS_MODE_GET, &source);
>>>>     if (ret)
>>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>>                           &wdev->dev, wdev,
>>>>                           &nvidia_wmi_ec_backlight_ops,
>>>>                           &props);
>>>> -    return PTR_ERR_OR_ZERO(bdev);
>>>> +
>>>> +    if (IS_ERR(bdev))
>>>> +        return PTR_ERR(bdev);
>>>> +
>>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>>> +    if (!priv)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    priv->bl_dev = bdev;
>>>> +
>>>> +    dev_set_drvdata(&wdev->dev, priv);
>>>> +
>>>> +    if (target) {
>>>> +        int level = scale_backlight_level(target, bdev);
>>>> +
>>>> +        if (backlight_device_set_brightness(bdev, level))
>>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>>> +                backlight_proxy_target);
>>>> +        priv->proxy_target = target;
>>>> +    }
>>>> +
>>>> +    if (restore_level_on_resume) {
>>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>>> +        register_pm_notifier(&priv->nb);
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>>> +{
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>> +
>>>> +    if (priv->nb.notifier_call)
>>>> +        unregister_pm_notifier(&priv->nb);
>>>> }
>>>>
>>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>>         .name = "nvidia-wmi-ec-backlight",
>>>>     },
>>>>     .probe = nvidia_wmi_ec_backlight_probe,
>>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>>     .id_table = nvidia_wmi_ec_backlight_id_table,
>>>> };
>>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
@ 2022-03-17 18:36               ` Daniel Dadap
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-17 18:36 UTC (permalink / raw)
  To: Hans de Goede
  Cc: dri-devel, platform-driver-x86, markgross, pobrn, Alexandru Dinu,
	Mario.Limonciello


On 3/17/22 11:42, Hans de Goede wrote:
> Hi Daniel,
>
> On 3/17/22 14:28, Daniel Dadap wrote:
>>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>>> Some notebook systems with EC-driven backlight control appear to have a
>>>> firmware bug which causes the system to use GPU-driven backlight control
>>>> upon a fresh boot, but then switches to EC-driven backlight control
>>>> after completing a suspend/resume cycle. All the while, the firmware
>>>> reports that the backlight is under EC control, regardless of what is
>>>> actually controlling the backlight brightness.
>>>>
>>>> This leads to the following behavior:
>>>>
>>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>>   WMI-wrapped ACPI method erroneously reporting EC control.
>>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>>   cycle, due to the backlight control actually being GPU-driven.
>>>> * GPU drivers also register their own backlight handlers: in the case
>>>>   of the notebook system where this behavior has been observed, both
>>>>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>>   case observed so far) can successfully control the backlight through
>>>>   its backlight driver's sysfs interface, but stops working after the
>>>>   first suspend/resume cycle.
>>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>>   fresh boot, but begins to work after the first suspend/resume cycle.
>>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>>   is not able to control the backlight at any point while the system
>>>>   is in operation. On similar hybrid systems with an EC-controlled
>>>>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>>   does not register its backlight handler. It has not been determined
>>>>   whether the non-functional handler registered by the NVIDIA driver
>>>>   is due to another firmware bug, or a bug in the NVIDIA driver.
>>>>
>>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>>> by the GPU drivers. This in turn leads to backlight control appearing
>>>> to be non-functional until after completing a suspend/resume cycle.
>>>> However, it is still possible to control the backlight through direct
>>>> interaction with the working GPU driver's backlight sysfs interface.
>>>>
>>>> These systems also appear to have a second firmware bug which resets
>>>> the EC's brightness level to 100% on resume, but leaves the state in
>>>> the kernel at the pre-suspend level. This causes attempts to save
>>>> and restore the backlight level across the suspend/resume cycle to
>>>> fail, due to the level appearing not to change even though it did.
>>>>
>>>> In order to work around these issues, add a quirk table to detect
>>>> systems that are known to show these behaviors. So far, there is
>>>> only one known system that requires these workarounds, and both
>>>> issues are present on that system, but the quirks are tracked
>>>> separately to make it easier to add them to other systems which
>>>> may exhibit one of the bugs, but not the other. The original systems
>>>> that this driver was tested on during development do not exhibit
>>>> either of these quirks.
>>>>
>>>> If a system with the "GPU driver has backlight control" quirk is
>>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>>> (when freshly booted) GPU backlight handler and relay any backlight
>>>> brightness level change requests directed at the EC to also be applied
>>>> to the GPU backlight interface. This leads to redundant updates
>>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>>> it does allow the EC backlight control to work when the system is
>>>> freshly booted.
>>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>>> plans to clean-up the whole x86 backlight mess soon and an important part
>>> of that is to stop registering multiple backlight interfaces for the
>>> same panel/screen.
>>>
>>> Whereas going with this workaround requires us to have 2
>>> backlight interfaces active. Also this will very likely lead to
>>> (subtly) different backlight behavior before and after the first
>>> suspend/resume.
>> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?
> ATM the kernel basically only supports a bunch of different methods
> to control the backlight of 1 internal panel. The plan is to tie this
> to the panel from a userspace pov by exposing brightness +
> max_brightness as properties on the drm_connector object for the
> internal panel.
>
> The in kernel tying of the backlight device to the internal panel
> will be done hardcoded inside the drm driver(s) based on the
> drivers already knowing which connector is the internal panel.


Okay. At the moment the other problem I am thinking about also makes a 
one-internal-panel assumption, and it's true for the particular hardware 
that I'm working with, but I didn't like that assumption. If there isn't 
something existing that can be used to link connectors from multiple 
GPUs together to indicate they actually (potentially) drive the same 
display panel, then I suppose I'll continue assuming that. But I really 
do think the >1 display case needs to be more than just an afterthought, 
even if the solution, for now, is to just have all the drivers agree 
that a single internal panel is e.g. index 0 of some shared table that 
the drivers can all consult so that everybody knows which panel is which.

I am also concerned about muxed designs that don't have an EC-controlled 
backlight. If there's only allowed to be one backlight device per panel, 
then that means vga_switcheroo would have to do something to disable the 
outgoing GPU's backlight control and enable the incoming one's. And then 
all the userspace software that controls the sysfs backlight interfaces 
would need to make sure to check to see what backlight devices are 
exposed every time a brightness change request is made, and not e.g. 
just once at init and then assume that the backlight devices won't 
change over the lifetime of whatever backlight manager software we're 
talking about (e.g. gnome-settings-daemon). I suppose if what the kernel 
exposes is an abstraction so that userspace only sees one backlight 
interface per panel in sysfs at any given time, which might actually be 
connected to one of several different drivers in the backlight 
subsystem, that would make it a little better, but I still think there 
would be potential for races between a mux switch and a brightness 
change event.

> This all naively assumes there is only 1 internal panel, which
> for the majority of cases is true. My plan for devices with
> 2 internal panels is to cross that bridge when we get there
> (I expect those mostly in phone/tablet like devices for now
> which will likely use devicetree where solving this is trivial).
>
> I do realize we will eventually get some x86/acpi device with
> 2 internal panels. Hopefully we can just figure out what
> the Windows drivers are doing there and parse e.g. the ACPI
> info which Windows is using for this.


I'm not aware of any more modern examples, but there already have been 
such systems. See e.g. Lenovo ThinkPad W701ds. IIRC that system was 
discrete GPU only, so there wouldn't be any concern about coordinating 
panel IDs between different GPU drivers, but there is a very real 
possibility that a vendor may want to bring back a similar design with 
hybrid GPUs. The asymmetry of that system always bothered me a bit: I 
imagine that if the NVIDIA GPUs of that era supported more than two 
heads per GPU, we may have seen a three-screen notebook.

It's very possible that some of the modern exotic e.g. folding designs 
show up as dual displays, but I don't have any first-hand experience 
with those to know whether that is the case. I definitely remember the 
W701ds exposed the two displays as separate entities. (And the one that 
popped off to the side was rotated, which was fun.)


> As part of the move to properties on the drm_connector object
> the /sys/class/backlight interface will become deprecated,
> but will be kept for backward compat and will eventually
> be put behind a Kconfig option.


If you have a high-level design for this written down somewhere, do you 
mind sharing it? I want to get an idea of what types of changes the 
proprietary nvidia-drm driver might need to participate in this.


> The kernel internal backlight_device stuff will be kept
> since we need some internal representation anyways and
> I don't see much value in reworking that, esp. since
> we need to have /sys/class/backlight backward compat.
>
> Note this is all based on discussions which I had with
> mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
> never gotten around to actually starting work on this,
> but this has resurfaced recently and I plan to actually
> take a stab at implementing this plan sometime during 2022.
>
>> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.
> Right, see above the main idea is to make this
> "the kernel's problem" and I expect us to fix this in
> the kernel in a variety of different ways depending on
> the actual hardware.
>
> As for "troublesome for backlight drivers such as this one
> which aren’t associated with any GPU.", the idea is that:
>
> 1. E.g. the i915 driver (which I have the most experience with)
> knows which connector is the internal panel


Sure, the NVIDIA proprietary driver knows this as well, and what I've 
seen from the other DRM drivers also suggests that this is an easy 
determination to make.


> 2. The acpi_video_get_backlight_type() helper from
> drivers/acpi/video_detect.c will get extended to make sure
> that there is always only *1* /sys/class/backlight device.
>
> To be specific atm code supporting old vendor specific backlight
> fw interfaces, e.g. drivers/platform/x86/dell-laptop.c:
> already does:
>
>         if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
>                  return 0;
>
> And drivers/acpi/acpi_video.c also already does:
>
>         if (acpi_video_get_backlight_type() != acpi_backlight_video)
>                  return 0;
>
> Currently looking at the 3 main x86 backlight interfaces: vendor,
> generic-ACPI and native-drm-driver, only the native driver's
> backlight registers unconditionally. The plan is to make those also
> do a similar check (*) and to also add special backlight drivers like
> nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
> to this mechanism.


nvidia-wmi-ec-backlight does this check during probe:

>         ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>                                    WMI_BRIGHTNESS_MODE_GET, &source);
>         if (ret)
>                 return ret;
>
>         /*
>          * This driver is only to be used when brightness control is handled
>          * by the EC; otherwise, the GPU driver(s) should control brightness.
>          */
>         if (source != WMI_BRIGHTNESS_SOURCE_EC)
>                 return -ENODEV;


And part of the problem with this bug is that what the firmware is 
reporting here is a lie, at least until the first suspend/resume cycle.

Also, unlike the DRM drivers, the NVIDIA proprietary driver does not 
register its backlight handler unconditionally: one of the strange 
things about this system is that the NVIDIA proprietary driver *does* 
register a backlight handler, even though it's definitely not supposed 
to in this case. I did notice that the DRM drivers don't currently seem 
to have any checks to see whether they should register, and indeed on 
the EC backlight systems I have tested, with both amdgpu and i915, there 
is a non-functional sysfs backlight interface exposed by the iGPU. One 
of the other strange things about the system is the fact that the amdgpu 
backlight driver works at all.

> 3. 1 + 2 means that the drm_driver can just tie the single
> backlight_device which will be registered on the system to
> the internal panel.

This sounds kind of ugly to me. If the backlight control is 
GPU-agnostic, as is the case here, then associating the backlight_device 
with a drm_driver doesn't seem right. It sounds like the plan is to move 
from the current sysfs interface to one that's exposed by the DRM 
subsystem. If that's the case, I think there should be a way to do so 
without tying it to a specific GPU driver. e.g., on a muxed system, one 
or the other GPU might be scanning out at any given time (which is why 
many of these systems have non-GPU backlight control), so if the 
interface is GPU-specific, that would be at the very least confusing.


> Again I'm completely ignoring dual-internal-panel devices here
> for simplicity's sake.
>
> Note this is getting a bit off-topic, but if you have insights
> in this, or already can think of ways how this is not going to
> work :)  please let me know.


My concerns listed above are everything I can think of off the top of my 
head. I'm sure if I think about it more I'll come up with other 
concerns. This did get a bit off-topic, but I'm actually very glad 
you've told me all this.


>
> *) And adding that check + the presence of nvidia-wmi-ec-backlight
> support will make the native drm-driver not register its
> backlight_device at all at which point the backlight-proxy workaround
> from this patch breaks.
>
>
>> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
>>
>> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
> Good question, I must admit I stopped reading the patch after seeing
> the proxy thing.
>
> I see that you are using a pm_notifier for this. I wonder if you
> can try (on your own system) to add a pm_ops struct and make
> wmi_driver.driver.pm point to that and check if that gets called
> by adding e.g. a pr_info (I don't see why it would not get called).
>
> And assuming that works, using that would be a bit cleaner IMHO.
> Although that does have resume-ordering implications. But I would
> expect the EC to basically be always ready to get talked to at
> the point in the resume cycle where normal (non early) resume
> handlers are called.
>
> To be clear the idea would be to always have the suspend handler
> (so that the driver and pm_ops structs can be const) and to check
> a quirk flag inside the resume handler. Or maybe even just always
> read back the brightness from the hw and check if it has changed?
> Does this need to be behind a quirk ?


Okay, thanks. I missed the wmi_driver.driver.pm when I was looking to 
wire up something to refresh the backlight level on resume; that does 
sound cleaner than registering a notifier. Alex did report that the 
backlight was briefly at maximum brightness following resume, before the 
notifier kicked in and reset it to the correct level, so hopefully 
hooking it up through pm_ops will be able to catch it early enough to 
avoid that. And making the refresh unconditional rather than hidden 
behind a quirk sounds reasonable too; I expect it to be harmless on 
systems that don't need it.
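
As a rough illustration of the unconditional variant, the following is a userspace model, not driver code: fake_ec, fake_backlight, and the helpers are hypothetical stand-ins for the EC and the driver's cached state. It models the read-back-and-compare idea Hans raised, where the resume hook rewrites the level only if the EC no longer matches what the kernel last requested. In the driver this logic would presumably live in a resume callback reached through wmi_driver.driver.pm, as suggested above.

```c
/* Hypothetical stand-in for the EC: the level the hardware applies. */
struct fake_ec {
	int level;
};

/* Hypothetical stand-in for the driver's cached backlight state. */
struct fake_backlight {
	struct fake_ec *ec;
	int brightness;	/* level last requested through the kernel */
	int writes;	/* number of writes issued to the EC */
};

/* Push a level to the (fake) EC and count the write. */
static void ec_write(struct fake_backlight *bl, int level)
{
	bl->ec->level = level;
	bl->writes++;
}

/* Models the firmware bug: resume resets the EC level to maximum. */
static void firmware_resume(struct fake_ec *ec, int max)
{
	ec->level = max;
}

/*
 * Resume hook: read back the EC level and rewrite the cached level only
 * if the two disagree. Returns 1 if a write was needed, 0 otherwise.
 */
static int fake_resume(struct fake_backlight *bl)
{
	if (bl->ec->level == bl->brightness)
		return 0;

	ec_write(bl, bl->brightness);
	return 1;
}
```

On a system without the reset bug the comparison succeeds and no write is issued, which is why dropping the quirk check should be harmless there.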


>>> Is there no other way to solve this issue? Maybe we need to poke
>>> vgaswitcheroo to set the current GPU mode even though this is
>>> already reported as active to get things to switch to the ECs
>>> control right away ?
>> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.
> Right, as you said the EC is always supposed to be in control, but
> it is not. I would not be surprised if making the ACPI call to put
> things in dynamic mode (even though they already are) fixes this,
> assuming there is such an ACPI call...


Yes, there is one, although changes to the mode aren't supposed to take 
effect until reboot. However, this system already seems to be 
misbehaving, so it's certainly possible that poking at that call will do 
something.


>>> I'm pretty certain that Windows is not doing this backlight proxying,
>>> IMHO we need to figure out what causes the switch after suspend/resume
>>> and then do that thing at boot.
>> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.
> Great, thank you.
>
> Regards,
>
> Hans
>
>
>
>>>> If a system with the "backlight level reset to full on resume" quirk
>>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>>> reset the backlight to the previous level upon resume.
>>>>
>>>> These workarounds are also plumbed through to kernel module parameters,
>>>> to make it easier for users who suspect they may be affected by one or
>>>> both of these bugs to test whether these workarounds are effective on
>>>> their systems as well.
>>>>
>>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>>> ---
>>>> Note: the Tested-by: line above applies to the previous version of this
>>>> patch; an explicit ACK from the tester is required for it to apply to
>>>> the current version.
>>>>
>>>> v2:
>>>> * Add readable sysfs files for module params, use linear interpolation
>>>>    from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>>    for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>>    (Mario Limonciello <Mario.Limonciello@amd.com>)
>>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>>
>>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> index 61e37194df70..95e1ddf780fc 100644
>>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>> @@ -3,8 +3,12 @@
>>>>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>>   */
>>>>
>>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>>> +
>>>> #include <linux/acpi.h>
>>>> #include <linux/backlight.h>
>>>> +#include <linux/dmi.h>
>>>> +#include <linux/fixp-arith.h>
>>>> #include <linux/mod_devicetable.h>
>>>> #include <linux/module.h>
>>>> #include <linux/types.h>
>>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>>     u32 ignored[3];
>>>> };
>>>>
>>>> +/**
>>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>>> + * @bl_dev:       the associated backlight device
>>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>>> + * @nb:           notifier block for resume callback
>>>> + */
>>>> +struct nvidia_wmi_ec_backlight_priv {
>>>> +    struct backlight_device *bl_dev;
>>>> +    struct backlight_device *proxy_target;
>>>> +    struct notifier_block nb;
>>>> +};
>>>> +
>>>> +static char *backlight_proxy_target;
>>>> +module_param(backlight_proxy_target, charp, 0444);
>>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>>> +
>>>> +static int max_reprobe_attempts = 128;
>>>> +module_param(max_reprobe_attempts, int, 0444);
>>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>>> +
>>>> +static bool restore_level_on_resume;
>>>> +module_param(restore_level_on_resume, bool, 0444);
>>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>>> +
>>>> +/* Bit field values for quirks table */
>>>> +
>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>>> +
>>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>>> +
>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>>> +
>>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>>> +
>>>> +static int assign_quirks(const struct dmi_system_id *id)
>>>> +{
>>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>>> +        restore_level_on_resume = 1;
>>>> +
>>>> +    /* If the module parameter is set, override the quirks table */
>>>> +    if (!backlight_proxy_target) {
>>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>>> +            backlight_proxy_target = "amdgpu_bl1";
>>>> +    }
>>>> +
>>>> +    return true;
>>>> +}
>>>> +
>>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>>> +    .callback = assign_quirks,                      \
>>>> +    .matches = {                                    \
>>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>>> +    },                                              \
>>>> +    .driver_data = (void *)(quirks)                 \
>>>> +}
>>>> +
>>>> +static const struct dmi_system_id quirks_table[] = {
>>>> +    QUIRK_ENTRY(
>>>> +        /* This quirk is present as of firmware revision HACN31WW */
>>>> +        "LENOVO", "Legion S7 15ACH6",
>>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>>> +    ),
>>>> +    { }
>>>> +};
>>>> +
>>>> /**
>>>>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>>     return 0;
>>>> }
>>>>
>>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>>> +static int scale_backlight_level(const struct backlight_device *from,
>>>> +                 const struct backlight_device *to)
>>>> +{
>>>> +    int from_max = from->props.max_brightness;
>>>> +    int from_level = from->props.brightness;
>>>> +    int to_max = to->props.max_brightness;
>>>> +
>>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>>> +}
>>>> +
>>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>>> {
>>>>     struct wmi_device *wdev = bl_get_data(bd);
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>>> +
>>>> +    if (proxy_target) {
>>>> +        int level = scale_backlight_level(bd, proxy_target);
>>>> +
>>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>>> +                backlight_proxy_target);
>>>> +    }
>>>>
>>>>     return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>>                                  WMI_BRIGHTNESS_MODE_SET,
>>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>>     .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>>> };
>>>>
>>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>>> +{
>>>> +
>>>> +    /*
>>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>>> +     * resuming from suspend, but the backlight device state still reflects
>>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>>> +     * state back up with the kernel's.
>>>> +     */
>>>> +    if (event == PM_POST_SUSPEND) {
>>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>>> +        int ret;
>>>> +
>>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>>> +        ret = backlight_update_status(p->bl_dev);
>>>> +
>>>> +        if (ret)
>>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>>> +
>>>> +        return NOTIFY_OK;
>>>> +    }
>>>> +
>>>> +    return NOTIFY_DONE;
>>>> +}
>>>> +
>>>> +static void putdev(void *data)
>>>> +{
>>>> +    struct device *dev = data;
>>>> +
>>>> +    put_device(dev);
>>>> +}
>>>> +
>>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>>> {
>>>> +    struct backlight_device *bdev, *target = NULL;
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>>     struct backlight_properties props = {};
>>>> -    struct backlight_device *bdev;
>>>>     u32 source;
>>>>     int ret;
>>>>
>>>> +    /*
>>>> +     * Check quirks tables to see if this system needs any of the firmware
>>>> +     * bug workarounds.
>>>> +     */
>>>> +    dmi_check_system(quirks_table);
>>>> +
>>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>>> +        static int num_reprobe_attempts;
>>>> +
>>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>>> +
>>>> +        if (target) {
>>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>>> +                               &target->dev);
>>>> +            if (ret)
>>>> +                return ret;
>>>> +        } else {
>>>> +            /*
>>>> +             * The target backlight device might not be ready;
>>>> +             * try again and disable backlight proxying if it
>>>> +             * fails too many times.
>>>> +             */
>>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>>> +                num_reprobe_attempts++;
>>>> +                return -EPROBE_DEFER;
>>>> +            }
>>>> +
>>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>>> +                backlight_proxy_target, max_reprobe_attempts);
>>>> +        }
>>>> +    }
>>>> +
>>>>     ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>>                                WMI_BRIGHTNESS_MODE_GET, &source);
>>>>     if (ret)
>>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>>                           &wdev->dev, wdev,
>>>>                           &nvidia_wmi_ec_backlight_ops,
>>>>                           &props);
>>>> -    return PTR_ERR_OR_ZERO(bdev);
>>>> +
>>>> +    if (IS_ERR(bdev))
>>>> +        return PTR_ERR(bdev);
>>>> +
>>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>>> +    if (!priv)
>>>> +        return -ENOMEM;
>>>> +
>>>> +    priv->bl_dev = bdev;
>>>> +
>>>> +    dev_set_drvdata(&wdev->dev, priv);
>>>> +
>>>> +    if (target) {
>>>> +        int level = scale_backlight_level(target, bdev);
>>>> +
>>>> +        if (backlight_device_set_brightness(bdev, level))
>>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>>> +                backlight_proxy_target);
>>>> +        priv->proxy_target = target;
>>>> +    }
>>>> +
>>>> +    if (restore_level_on_resume) {
>>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>>> +        register_pm_notifier(&priv->nb);
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>>> +{
>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>> +
>>>> +    if (priv->nb.notifier_call)
>>>> +        unregister_pm_notifier(&priv->nb);
>>>> }
>>>>
>>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>>         .name = "nvidia-wmi-ec-backlight",
>>>>     },
>>>>     .probe = nvidia_wmi_ec_backlight_probe,
>>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>>     .id_table = nvidia_wmi_ec_backlight_id_table,
>>>> };
>>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 17:35             ` Alex Deucher
@ 2022-03-17 18:50                 ` Daniel Dadap
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Dadap @ 2022-03-17 18:50 UTC (permalink / raw)
  To: Alex Deucher, Hans de Goede
  Cc: dri-devel, platform-driver-x86, markgross, pobrn, Alexandru Dinu,
	Mario.Limonciello

On 3/17/22 12:35, Alex Deucher wrote:
> Sorry for jumping in here, but I can't seem to find the original
> thread with this comment.  amdgpu_atombios_encoder_init_backlight() is
> not applicable to these systems.  That is the old pre-DC code path.
> You want amdgpu_dm_register_backlight_device() for modern hardware.


Oops, thanks for the correction. Alex Dinu: see the above for the 
correct code path to disable to test whether not registering the amdgpu 
backlight device helps. I have some other things to attend to, so it 
will be a little while before I can get you the instrumented driver I 
mentioned in one of my replies to Hans, but hopefully we'll be able to 
figure something out to actually switch the backlight control to EC 
without having to do a suspend/resume cycle.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-17 18:36               ` Daniel Dadap
@ 2022-03-18 17:42                 ` Hans de Goede
  -1 siblings, 0 replies; 31+ messages in thread
From: Hans de Goede @ 2022-03-18 17:42 UTC (permalink / raw)
  To: Daniel Dadap
  Cc: platform-driver-x86, pobrn, markgross, Mario.Limonciello,
	Alexandru Dinu, dri-devel, Daniel Vetter

Hi Daniel,

On 3/17/22 19:36, Daniel Dadap wrote:
> 
> On 3/17/22 11:42, Hans de Goede wrote:
>> Hi Daniel,
>>
>> On 3/17/22 14:28, Daniel Dadap wrote:
>>>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>>>> Some notebook systems with EC-driven backlight control appear to have a
>>>>> firmware bug which causes the system to use GPU-driven backlight control
>>>>> upon a fresh boot, but then switches to EC-driven backlight control
>>>>> after completing a suspend/resume cycle. All the while, the firmware
>>>>> reports that the backlight is under EC control, regardless of what is
>>>>> actually controlling the backlight brightness.
>>>>>
>>>>> This leads to the following behavior:
>>>>>
>>>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>>>   WMI-wrapped ACPI method erroneously reporting EC control.
>>>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>>>   cycle, due to the backlight control actually being GPU-driven.
>>>>> * GPU drivers also register their own backlight handlers: in the case
>>>>>   of the notebook system where this behavior has been observed, both
>>>>>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>>>   case observed so far) can successfully control the backlight through
>>>>>   its backlight driver's sysfs interface, but stops working after the
>>>>>   first suspend/resume cycle.
>>>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>>>   fresh boot, but begins to work after the first suspend/resume cycle.
>>>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>>>   is not able to control the backlight at any point while the system
>>>>>   is in operation. On similar hybrid systems with an EC-controlled
>>>>>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>>>   does not register its backlight handler. It has not been determined
>>>>>   whether the non-functional handler registered by the NVIDIA driver
>>>>>   is due to another firmware bug, or a bug in the NVIDIA driver.
>>>>>
>>>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>>>> by the GPU drivers. This in turn leads to backlight control appearing
>>>>> to be non-functional until after completing a suspend/resume cycle.
>>>>> However, it is still possible to control the backlight through direct
>>>>> interaction with the working GPU driver's backlight sysfs interface.
>>>>>
>>>>> These systems also appear to have a second firmware bug which resets
>>>>> the EC's brightness level to 100% on resume, but leaves the state in
>>>>> the kernel at the pre-suspend level. This causes attempts to save
>>>>> and restore the backlight level across the suspend/resume cycle to
>>>>> fail, due to the level appearing not to change even though it did.
>>>>>
>>>>> In order to work around these issues, add a quirk table to detect
>>>>> systems that are known to show these behaviors. So far, there is
>>>>> only one known system that requires these workarounds, and both
>>>>> issues are present on that system, but the quirks are tracked
>>>>> separately to make it easier to add them to other systems which
>>>>> may exhibit one of the bugs, but not the other. The original systems
>>>>> that this driver was tested on during development do not exhibit
>>>>> either of these quirks.
>>>>>
>>>>> If a system with the "GPU driver has backlight control" quirk is
>>>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>>>> (when freshly booted) GPU backlight handler and relays any backlight
>>>>> brightness level change requests directed at the EC to also be applied
>>>>> to the GPU backlight interface. This leads to redundant updates
>>>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>>>> it does allow the EC backlight control to work when the system is
>>>>> freshly booted.
>>>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>>>> plans to clean-up the whole x86 backlight mess soon and an important part
>>>> of that is to stop registering multiple backlight interfaces for the
>>>> same panel/screen.
>>>>
>>>> Whereas going with this workaround requires us to have 2
>>>> backlight interfaces active. Also this will very likely lead to
>>>> (subtly) different backlight behavior before and after the first
>>>> suspend/resume.
>>> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?
>> ATM the kernel basically only supports a bunch of different methods
>> to control the backlight of 1 internal panel. The plan is to tie this
>> to the panel from a userspace pov by making the brightness +
>> max_brightness properties on the drm_connector object for the
>> internal-panel.
>>
>> The in kernel tying of the backlight device to the internal panel
>> will be done hardcoded inside the drm driver(s) based on the
>> drivers already knowing which connector is the internal panel.
> 
> 
> Okay. At the moment the other problem I am thinking about also makes a one-internal-panel assumption, and it's true for the particular hardware that I'm working with, but I didn't like that assumption. If there isn't something existing that can be used to link connectors from multiple GPUs together to indicate they actually (potentially) drive the same display panel, then I suppose I'll continue assuming that. But I really do think the >1 display case needs to be more than just an afterthought, even if the solution, for now, is to just have all the drivers agree that a single internal panel is e.g. index 0 of some shared table that the drivers can all consult so that everybody knows which panel is which.
> 
> I am also concerned about muxed designs that don't have an EC-controlled backlight. If there's only allowed to be one backlight device per panel, then that means vga_switcheroo would have to do something to disable the outgoing GPU's backlight control and enable the incoming one's.

The one backlight device rule mainly applies to the case where the GPU driver's native backlight control is not used, so it needs to find another backlight-device (e.g. acpi_video or nvidia-wmi-ec-backlight) and proxy that to the properties on the panel's drm_connector.

If the native backlight control of the 2 GPUs is used then there is no such constraint, since in that case each GPU driver itself knows which backlight-device to use.

Currently acpi_video_get_backlight_type() has the following return values:

enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
};

The idea is to extend this to:

enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
        acpi_backlight_nvidia_wmi_ec,
        acpi_backlight_apple_gmux,
};

And then have *all* (x86) backlight drivers do:

         if (acpi_video_get_backlight_type() != acpi_backlight_<my_type>)
                 return 0;

before registering the backlight-device. This is currently already
done by the acpi_backlight_video and acpi_backlight_vendor type
drivers, but not by the other ones.
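
As a rough userspace illustration of the effect of that check (not
kernel code): fake_get_backlight_type(), should_register() and
count_registered() below are made-up names for this sketch, and
detected_type merely stands in for whatever
acpi_video_get_backlight_type() would return on a given system.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the extended enum proposed above. */
enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
        acpi_backlight_nvidia_wmi_ec,
        acpi_backlight_apple_gmux,
};

/* Hypothetical: the type the detection code settled on for this system. */
static enum acpi_backlight_type detected_type = acpi_backlight_undef;

/* Hypothetical stand-in for acpi_video_get_backlight_type(). */
static enum acpi_backlight_type fake_get_backlight_type(void)
{
        return detected_type;
}

/* The gate every backlight driver would apply before registering. */
static bool should_register(enum acpi_backlight_type my_type)
{
        return fake_get_backlight_type() == my_type;
}

/* Count how many of the given drivers would end up registering. */
static size_t count_registered(const enum acpi_backlight_type *drivers,
                               size_t n)
{
        size_t i, count = 0;

        assert(drivers != NULL || n == 0);

        for (i = 0; i < n; i++)
                if (should_register(drivers[i]))
                        count++;

        return count;
}
```

With one driver of each type present, exactly one passes the gate. Two
native GPU drivers both pass it when the detected type is
acpi_backlight_native, which is the hybrid case discussed next.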

In case of there being 2 native GPU driver backlight-devices (with the active GPU being in actual control), both GPU drivers will do:

         if (acpi_video_get_backlight_type() != acpi_backlight_native)
                 return 0;

and since acpi_video_get_backlight_type() will return acpi_backlight_native in this case, the if condition will be false and both will continue with registering their (native) backlight device, each offering the new brightness properties on the drm_connector for *their* internal display connection.

One of those 2 internal-panel drm_connectors should always be in a disconnected state, so this way userspace can choose which one to use (the one which is actually connected).

I believe something similar is already done by userspace on devices like this. For native backlight-devices userspace can find the matching GPU by looking at the parent device and I remember at least discussing userspace checking which GPU is connected to the panel to choose which backlight device to use on systems with 2 native backlight devices.
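
A minimal sketch of that userspace-side selection, assuming userspace
has already collected the internal-panel connectors and their status.
struct panel_connector, enum connector_status and
pick_active_connector() are hypothetical names for illustration only,
not an existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace view of a connector's status. */
enum connector_status {
        CONNECTOR_DISCONNECTED,
        CONNECTOR_CONNECTED,
};

/* Hypothetical record for one GPU's internal-panel connector. */
struct panel_connector {
        const char *name;               /* label chosen by the caller */
        enum connector_status status;
};

/*
 * Return the first internal-panel connector that is connected, or NULL
 * if none is (e.g. in the middle of a mux switch).
 */
static const struct panel_connector *
pick_active_connector(const struct panel_connector *conns, size_t n)
{
        size_t i;

        assert(conns != NULL || n == 0);

        for (i = 0; i < n; i++)
                if (conns[i].status == CONNECTOR_CONNECTED)
                        return &conns[i];

        return NULL;
}
```

Userspace would then use the brightness properties of the returned
connector, and redo the selection when it sees a disconnect +
reconnect.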

> And then all the userspace software that controls the sysfs backlight interfaces would need to make sure to check to see what backlight devices are exposed every time a brightness change request is made, and not e.g. just once at init and then assume that the backlight devices won't change over the lifetime of whatever backlight manager software we're talking about (e.g. gnome-settings-daemon). I suppose if what the kernel exposes is an abstraction so that userspace only sees one backlight interface per panel in sysfs at any given time, which might actually be connected to one of several different drivers in the backlight subsystem, that would make it a little better, but I still think there would be potential for races between a mux switch and a brightness change event.

See above; in this case there will be 2 separate drm_connector-s, and on a dynamic switch userspace will see a display disconnect + reconnect. Userspace really needs to see this to be able to set up things like prime properly.

> 
>> This all naively assumes there is only 1 internal panel, which
>> for the majority of cases is true. My plan for devices with
>> 2 internal panels is to cross that bridge when we get there
>> (I expect those mostly in phone/tablet like devices for now
>> which will likely use devicetree where solving this is trivial).
>>
>> I do realize we will eventually get some x86/acpi device with
>> 2 internal panels. Hopefully we can just figure out what
>> the Windows drivers are doing there and parse e.g. the ACPI
>> info which Windows is using for this.
> 
> 
> I'm not aware of any more modern examples, but there already have been such systems. See e.g. Lenovo ThinkPad W701ds. IIRC that system was discrete GPU only, so there wouldn't be any concern about coordinating panel IDs between different GPU drivers, but there is a very real possibility that a vendor may want to bring back a similar design with hybrid GPUs. The asymmetry of that system always bothered me a bit: I imagine that if the NVIDIA GPUs of that era supported more than two heads per GPU, we may have seen a three-screen notebook.
> 
> It's very possible that some of the modern exotic e.g. folding designs show up as dual displays, but I don't have any first-hand experience with those to know whether that is the case. I definitely remember the W701ds exposed the two displays as separate entities. (And the one that popped off to the side was rotated, which was fun.)
> 
> 
>> As part of the move to properties on the drm_connector object
>> the /sys/class/backlight interface will become deprecated,
>> but will be kept for backward compat and will eventually
>> be put behind a Kconfig option.
> 
> 
> If you have a high-level design for this written down somewhere, do you mind sharing it? I want to get an idea of what types of changes the proprietary nvidia-drm driver might need to participate in this.

I don't really have a high level design written down yet.

Basically the plan is to have some generic helper code to allow a GPU driver to proxy a backlight device (including its own native backlight) to a set of properties on the drm_connector object. As mentioned before, the idea is to keep the backlight-device API as kernel-internal API for this since we need this for backward compat and we need some sort of internal API between e.g. nvidia-wmi-ec-backlight and the GPU drivers anyway.

It will be the GPU driver's responsibility to pick a backlight-device + connector to pair up. Although I likely will add a helper for that too, which will take the GPU driver's own native backlight-device (may be NULL) + a drm-connector as input and then based on the acpi_video_get_backlight_type() return value find a backlight-device to proxy (iow it may end up not using the native backlight-device even if one is passed in).
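
Purely as a sketch of what the selection logic of such a helper might
look like (nothing like this exists yet): struct fake_backlight_dev and
pick_backlight_to_proxy() are made-up names, the drm-connector argument
is left out, and the detected type is passed in directly instead of
calling acpi_video_get_backlight_type().

```c
#include <assert.h>
#include <stddef.h>

enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
        acpi_backlight_nvidia_wmi_ec,
        acpi_backlight_apple_gmux,
};

/* Hypothetical stand-in for a registered backlight device. */
struct fake_backlight_dev {
        const char *name;
        enum acpi_backlight_type type;
};

/*
 * Pick the backlight device a GPU driver should proxy to its
 * internal-panel connector. 'native' may be NULL. The native device is
 * only used when the detected type says so; otherwise the first
 * registered device of the detected type is used, if there is one.
 */
static const struct fake_backlight_dev *
pick_backlight_to_proxy(enum acpi_backlight_type detected,
                        const struct fake_backlight_dev *native,
                        const struct fake_backlight_dev *registered,
                        size_t n)
{
        size_t i;

        assert(registered != NULL || n == 0);

        if (detected == acpi_backlight_native)
                return native;

        for (i = 0; i < n; i++)
                if (registered[i].type == detected)
                        return &registered[i];

        return NULL;
}
```

Note how a non-NULL native device is ignored when the detected type is
e.g. acpi_backlight_nvidia_wmi_ec.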

I expect the helpers for this to also be suitable for the proprietary nvidia driver, but I guess that also depends on how much of the existing drm-helpers you already use.

>> The kernel internal backlight_device stuff will be kept
>> since we need some internal representation anyways and
>> I don't see much value in reworking that, esp. since
>> we need to have /sys/class/backlight backward compat.
>>
>> Note this is all based on discussions which I had with
>> mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
>> never gotten around to actually start working on this,
>> but this has resurfaced recently and I plan to actually
>> take a stab at implementing this plan sometime during 2022.
>>
>>> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.
>> Right, see above the main idea is to make this
>> "the kernel's problem" and I expect us to fix this in
>> the kernel in a variety of different ways depending on
>> the actual hardware.
>>
>> As for "troublesome for backlight drivers such as this one
>> which aren’t associated with any GPU.", the idea is that:
>>
>> 1. E.g the i915 driver (which I have the most experience with)
>> knows which connector is the internal panel
> 
> 
> Sure, the NVIDIA proprietary driver knows this as well, and what I've seen from the other DRM drivers also suggests that this is an easy determination to make.
> 
> 
>> 2. The acpi_video_get_backlight_type() helper from
>> drivers/acpi/video_detect.c will get extended to make sure
>> that there is always only *1* /sys/class/backlight device.
>>
>> To be specific atm code supporting old vendor specific backlight
>> fw interfaces, e.g. drivers/platform/x86/dell-laptop.c:
>> already does:
>>
>>         if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
>>                  return 0;
>>
>> And drivers/acpi/acpi_video.c also already does:
>>
>>         if (acpi_video_get_backlight_type() != acpi_backlight_video)
>>                  return 0;
>>
>> Currently looking at the 3 main x86 backlight interfaces: vendor,
>> generic-ACPI and native-drm-driver, only the native driver's
>> backlight registers unconditionally. The plan is to make those also
>> do a similar check (*) and to also add special backlight drivers like
>> nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
>> to this mechanism.
> 
> 
> nvidia-wmi-ec-backlight does this check during probe:
> 
>>         ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>                                    WMI_BRIGHTNESS_MODE_GET, &source);
>>         if (ret)
>>                 return ret;
>>
>>         /*
>>          * This driver is only to be used when brightness control is handled
>>          * by the EC; otherwise, the GPU driver(s) should control brightness.
>>          */
>>         if (source != WMI_BRIGHTNESS_SOURCE_EC)
>>                 return -ENODEV;

Right, we will need to do something similar inside drivers/acpi/video_detect.c
in the future to make acpi_video_get_backlight_type() return 
acpi_backlight_nvidia_wmi_ec on systems which use this fw interface.

> 
> 
> And part of the problem with this bug is that what the firmware is reporting here is a lie, at least until the first suspend/resume cycle.
> 
> Also, unlike the DRM drivers, the NVIDIA proprietary driver does not register its backlight handler unconditionally: one of the strange things about this system is that the NVIDIA proprietary driver *does* register a backlight handler, even though it's definitely not supposed to in this case. I did notice that the DRM drivers don't currently seem to have any checks to see whether they should register, and indeed on the EC backlight systems I have tested, with both amdgpu and i915, there is a non-functional sysfs backlight interface exposed by the iGPU. One of the other strange things about the system is the fact that the amdgpu backlight driver works at all.


Right, note that if making the amdgpu driver not register its backlight handler
helps, then we might add acpi_backlight_nvidia_wmi_ec support to
acpi_video_get_backlight_type() sooner (now) and add a:

         if (acpi_video_get_backlight_type() != acpi_backlight_native)
                 return 0;

to amdgpu now, to fix this.

>> 3. 1 + 2 means that the drm_driver can just tie the single
>> backlight_device which will be registered on the system to
>> the internal panel.
> 
> This sounds kind of ugly to me. If the backlight control is GPU-agnostic, as is the case here, then associating the backlight_device with a drm_driver doesn't seem right. It sounds like the plan is to move from the current sysfs interface to one that's exposed by the DRM subsystem. If that's the case, I think there should be a way to do so without tying it to a specific GPU driver. e.g., on a muxed system, one or the other GPU might be scanning out at any given time (which is why many of these systems have non-GPU backlight control), so if the interface is GPU-specific, that would be at the very least confusing.

Most muxed systems that I know don't support dynamic switching, so we would tie e.g.
the nvidia-wmi-ec-backlight device to the internal-panel drm_connector of the GPU
which actually is connected to the panel.

For dynamic switching the drm_connector_s_ for the internal panel would both
proxy the nvidia-wmi-ec-backlight device and userspace will use the one for
the connector which is actually connected at a given time.

>> Again I'm completely ignoring dual-internal-panel devices here
>> for simplicity's sake.
>>
>> Note this is getting a bit off-topic, but if you have insights
>> in this, or already can think of ways how this is not going to
>> work :)  please let me know.
> 
> 
> My concerns listed above are everything I can think of off the top of my head. I'm sure if I think about it more I'll come up with other concerns. This did get a bit off-topic, but I'm actually very glad you've told me all this.

Yeah, it is good to discuss this now. It will likely be a couple
of months before I can actually post an RFC series implementing
my ideas for this.

Regards,

Hans


> 
> 
>>
>> *) And adding that check + the presence of nvidia-wmi-ec-backlight
>> support will make the native drm-driver not register its
>> backlight_device at all at which point the backlight-proxy workaround
>> from this patch breaks.
>>
>>
>>> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
>>>
>>> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
>> Good question, I must admit I stopped reading the patch after seeing
>> the proxy thing.
>>
>> I see that you are using a pm_notifier for this. I wonder if you
>> can try (on your own system) to add a pm_ops struct and make
>> wmi_driver.driver.pm point to that and check if that gets called
>> by adding e.g. a pr_info (I don't see why it would not get called).
>>
>> And assuming that works, using that would be a bit cleaner IMHO.
>> Although that does have resume-ordering implications. But I would
>> expect the EC to basically be always ready to get talked to at
>> the point in the resume cycle where normal (non early) resume
>> handlers are called.
>>
>> To be clear the idea would be to always have the suspend handler
>> (so that the driver and pm_ops structs can be const) and to check
>> a quirk flag inside the resume handler. Or maybe even just always
>> read back the brightness from the hw and check if it has changed?
>> Does this need to be behind a quirk ?
> 
> 
> Okay, thanks. I missed the wmi_driver.driver.pm when I was looking to wire up something to refresh the backlight level on resume; that does sound cleaner than registering a notifier. Alex did report that the backlight was briefly at maximum brightness following resume, before the notifier kicked in and reset it to the correct level, so hopefully hooking it up through pm_ops will be able to catch it early enough to avoid that. And making the refresh unconditional rather than hidden behind a quirk sounds reasonable too; I expect it to be harmless on systems that don't need it.
> 
> 
>>>> Is there no other way to solve this issue? Maybe we need to poke
>>>> vgaswitcheroo to set the current GPU mode even though this is
>>>> already reported as active to get things to switch to the ECs
>>>> control right away ?
>>> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.
>> Right, as you said the EC is always supposed to be in control, but
>> it is not. I would not be surprised if making the ACPI call to put
>> things in dynamic mode (even though they already are) fixes this,
>> assuming there is such an ACPI call...
> 
> 
> Yes, there is one, although changes to the mode aren't supposed to take effect until reboot. However, this system already seems to be misbehaving, so it's certainly possible that poking at that call will do something.
> 
> 
>>>> I'm pretty certain that Windows is not doing this backlight proxying,
>>>> IMHO we need to figure out what causes the switch after suspend/resume
>>>> and then do that thing at boot.
>>> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.
>> Great, thank you.
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>>>>> If a system with the "backlight level reset to full on resume" quirk
>>>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>>>> reset the backlight to the previous level upon resume.
>>>>>
>>>>> These workarounds are also plumbed through to kernel module parameters,
>>>>> to make it easier for users who suspect they may be affected by one or
>>>>> both of these bugs to test whether these workarounds are effective on
>>>>> their systems as well.
>>>>>
>>>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>>>> ---
>>>>> Note: the Tested-by: line above applies to the previous version of this
>>>>> patch; an explicit ACK from the tester is required for it to apply to
>>>>> the current version.
>>>>>
>>>>> v2:
>>>>> * Add readable sysfs files for module params, use linear interpolation
>>>>>    from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>>>    for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>>>    (Mario Limonciello <Mario.Limonciello@amd.com>)
>>>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>>>
>>>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> index 61e37194df70..95e1ddf780fc 100644
>>>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> @@ -3,8 +3,12 @@
>>>>>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>>>   */
>>>>>
>>>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>>>> +
>>>>> #include <linux/acpi.h>
>>>>> #include <linux/backlight.h>
>>>>> +#include <linux/dmi.h>
>>>>> +#include <linux/fixp-arith.h>
>>>>> #include <linux/mod_devicetable.h>
>>>>> #include <linux/module.h>
>>>>> #include <linux/types.h>
>>>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>>>     u32 ignored[3];
>>>>> };
>>>>>
>>>>> +/**
>>>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>>>> + * @bl_dev:       the associated backlight device
>>>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>>>> + * @nb:           notifier block for resume callback
>>>>> + */
>>>>> +struct nvidia_wmi_ec_backlight_priv {
>>>>> +    struct backlight_device *bl_dev;
>>>>> +    struct backlight_device *proxy_target;
>>>>> +    struct notifier_block nb;
>>>>> +};
>>>>> +
>>>>> +static char *backlight_proxy_target;
>>>>> +module_param(backlight_proxy_target, charp, 0444);
>>>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>>>> +
>>>>> +static int max_reprobe_attempts = 128;
>>>>> +module_param(max_reprobe_attempts, int, 0444);
>>>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>>>> +
>>>>> +static bool restore_level_on_resume;
>>>>> +module_param(restore_level_on_resume, bool, 0444);
>>>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>>>> +
>>>>> +/* Bit field values for quirks table */
>>>>> +
>>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>>>> +
>>>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>>>> +
>>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>>>> +
>>>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>>>> +
>>>>> +static int assign_quirks(const struct dmi_system_id *id)
>>>>> +{
>>>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>>>> +        restore_level_on_resume = 1;
>>>>> +
>>>>> +    /* If the module parameter is set, override the quirks table */
>>>>> +    if (!backlight_proxy_target) {
>>>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>>>> +            backlight_proxy_target = "amdgpu_bl1";
>>>>> +    }
>>>>> +
>>>>> +    return true;
>>>>> +}
>>>>> +
>>>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>>>> +    .callback = assign_quirks,                      \
>>>>> +    .matches = {                                    \
>>>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>>>> +    },                                              \
>>>>> +    .driver_data = (void *)(quirks)                 \
>>>>> +}
>>>>> +
>>>>> +static const struct dmi_system_id quirks_table[] = {
>>>>> +    QUIRK_ENTRY(
>>>>> +        /* This quirk is present as of firmware revision HACN31WW */
>>>>> +        "LENOVO", "Legion S7 15ACH6",
>>>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>>>> +    ),
>>>>> +    { }
>>>>> +};
>>>>> +
>>>>> /**
>>>>>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>>>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>>>> +static int scale_backlight_level(const struct backlight_device *from,
>>>>> +                 const struct backlight_device *to)
>>>>> +{
>>>>> +    int from_max = from->props.max_brightness;
>>>>> +    int from_level = from->props.brightness;
>>>>> +    int to_max = to->props.max_brightness;
>>>>> +
>>>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>>>> +}
>>>>> +
>>>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>>>> {
>>>>>     struct wmi_device *wdev = bl_get_data(bd);
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>>>> +
>>>>> +    if (proxy_target) {
>>>>> +        int level = scale_backlight_level(bd, proxy_target);
>>>>> +
>>>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>>>> +                backlight_proxy_target);
>>>>> +    }
>>>>>
>>>>>     return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>>>                                  WMI_BRIGHTNESS_MODE_SET,
>>>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>>>     .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>>>> };
>>>>>
>>>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>>>> +{
>>>>> +
>>>>> +    /*
>>>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>>>> +     * resuming from suspend, but the backlight device state still reflects
>>>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>>>> +     * state back up with the kernel's.
>>>>> +     */
>>>>> +    if (event == PM_POST_SUSPEND) {
>>>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>>>> +        int ret;
>>>>> +
>>>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>>>> +        ret = backlight_update_status(p->bl_dev);
>>>>> +
>>>>> +        if (ret)
>>>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>>>> +
>>>>> +        return NOTIFY_OK;
>>>>> +    }
>>>>> +
>>>>> +    return NOTIFY_DONE;
>>>>> +}
>>>>> +
>>>>> +static void putdev(void *data)
>>>>> +{
>>>>> +    struct device *dev = data;
>>>>> +
>>>>> +    put_device(dev);
>>>>> +}
>>>>> +
>>>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>>>> {
>>>>> +    struct backlight_device *bdev, *target = NULL;
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>>>     struct backlight_properties props = {};
>>>>> -    struct backlight_device *bdev;
>>>>>     u32 source;
>>>>>     int ret;
>>>>>
>>>>> +    /*
>>>>> +     * Check quirks tables to see if this system needs any of the firmware
>>>>> +     * bug workarounds.
>>>>> +     */
>>>>> +    dmi_check_system(quirks_table);
>>>>> +
>>>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>>>> +        static int num_reprobe_attempts;
>>>>> +
>>>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>>>> +
>>>>> +        if (target) {
>>>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>>>> +                               &target->dev);
>>>>> +            if (ret)
>>>>> +                return ret;
>>>>> +        } else {
>>>>> +            /*
>>>>> +             * The target backlight device might not be ready;
>>>>> +             * try again and disable backlight proxying if it
>>>>> +             * fails too many times.
>>>>> +             */
>>>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>>>> +                num_reprobe_attempts++;
>>>>> +                return -EPROBE_DEFER;
>>>>> +            }
>>>>> +
>>>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>>>> +                backlight_proxy_target, max_reprobe_attempts);
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>>     ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>>>                                WMI_BRIGHTNESS_MODE_GET, &source);
>>>>>     if (ret)
>>>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>>>                           &wdev->dev, wdev,
>>>>>                           &nvidia_wmi_ec_backlight_ops,
>>>>>                           &props);
>>>>> -    return PTR_ERR_OR_ZERO(bdev);
>>>>> +
>>>>> +    if (IS_ERR(bdev))
>>>>> +        return PTR_ERR(bdev);
>>>>> +
>>>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>>>> +    if (!priv)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    priv->bl_dev = bdev;
>>>>> +
>>>>> +    dev_set_drvdata(&wdev->dev, priv);
>>>>> +
>>>>> +    if (target) {
>>>>> +        int level = scale_backlight_level(target, bdev);
>>>>> +
>>>>> +        if (backlight_device_set_brightness(bdev, level))
>>>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>>>> +                backlight_proxy_target);
>>>>> +        priv->proxy_target = target;
>>>>> +    }
>>>>> +
>>>>> +    if (restore_level_on_resume) {
>>>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>>>> +        register_pm_notifier(&priv->nb);
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>>>> +{
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>>> +
>>>>> +    if (priv->nb.notifier_call)
>>>>> +        unregister_pm_notifier(&priv->nb);
>>>>> }
>>>>>
>>>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>>>         .name = "nvidia-wmi-ec-backlight",
>>>>>     },
>>>>>     .probe = nvidia_wmi_ec_backlight_probe,
>>>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>>>     .id_table = nvidia_wmi_ec_backlight_id_table,
>>>>> };
>>>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> 



* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
@ 2022-03-18 17:42                 ` Hans de Goede
  0 siblings, 0 replies; 31+ messages in thread
From: Hans de Goede @ 2022-03-18 17:42 UTC (permalink / raw)
  To: Daniel Dadap
  Cc: dri-devel, platform-driver-x86, markgross, pobrn, Alexandru Dinu,
	Mario.Limonciello

Hi Daniel,

On 3/17/22 19:36, Daniel Dadap wrote:
> 
> On 3/17/22 11:42, Hans de Goede wrote:
>> Hi Daniel,
>>
>> On 3/17/22 14:28, Daniel Dadap wrote:
>>>> On Mar 17, 2022, at 07:17, Hans de Goede <hdegoede@redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>>> On 3/16/22 21:33, Daniel Dadap wrote:
>>>>> Some notebook systems with EC-driven backlight control appear to have a
>>>>> firmware bug which causes the system to use GPU-driven backlight control
>>>>> upon a fresh boot, but then switches to EC-driven backlight control
>>>>> after completing a suspend/resume cycle. All the while, the firmware
>>>>> reports that the backlight is under EC control, regardless of what is
>>>>> actually controlling the backlight brightness.
>>>>>
>>>>> This leads to the following behavior:
>>>>>
>>>>> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>>>>>   WMI-wrapped ACPI method erroneously reporting EC control.
>>>>> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>>>>>   cycle, due to the backlight control actually being GPU-driven.
>>>>> * GPU drivers also register their own backlight handlers: in the case
>>>>>   of the notebook system where this behavior has been observed, both
>>>>>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
>>>>> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>>>>>   case observed so far) can successfully control the backlight through
>>>>>   its backlight driver's sysfs interface, but stops working after the
>>>>>   first suspend/resume cycle.
>>>>> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>>>>>   fresh boot, but begins to work after the first suspend/resume cycle.
>>>>> * The GPU which does not have backlight control (NVIDIA in this case)
>>>>>   is not able to control the backlight at any point while the system
>>>>>   is in operation. On similar hybrid systems with an EC-controlled
>>>>>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>>>>>   does not register its backlight handler. It has not been determined
>>>>>   whether the non-functional handler registered by the NVIDIA driver
>>>>>   is due to another firmware bug, or a bug in the NVIDIA driver.
>>>>>
>>>>> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
>>>>> device, it takes precedence over the BACKLIGHT_RAW devices registered
>>>>> by the GPU drivers. This in turn leads to backlight control appearing
>>>>> to be non-functional until after completing a suspend/resume cycle.
>>>>> However, it is still possible to control the backlight through direct
>>>>> interaction with the working GPU driver's backlight sysfs interface.
>>>>>
>>>>> These systems also appear to have a second firmware bug which resets
>>>>> the EC's brightness level to 100% on resume, but leaves the state in
>>>>> the kernel at the pre-suspend level. This causes attempts to save
>>>>> and restore the backlight level across the suspend/resume cycle to
>>>>> fail, due to the level appearing not to change even though it did.
>>>>>
>>>>> In order to work around these issues, add a quirk table to detect
>>>>> systems that are known to show these behaviors. So far, there is
>>>>> only one known system that requires these workarounds, and both
>>>>> issues are present on that system, but the quirks are tracked
>>>>> separately to make it easier to add them to other systems which
>>>>> may exhibit one of the bugs, but not the other. The original systems
>>>>> that this driver was tested on during development do not exhibit
>>>>> either of these quirks.
>>>>>
>>>>> If a system with the "GPU driver has backlight control" quirk is
>>>>> detected, nvidia-wmi-ec-backlight will grab a reference to the working
>>>>> (when freshly booted) GPU backlight handler and relays any backlight
>>>>> brightness level change requests directed at the EC to also be applied
>>>>> to the GPU backlight interface. This leads to redundant updates
>>>>> directed at the GPU backlight driver after a suspend/resume cycle, but
>>>>> it does allow the EC backlight control to work when the system is
>>>>> freshly booted.
>>>> Ugh, I'm really not a fan of the backlight proxy plan here. I have
>>>> plans to clean-up the whole x86 backlight mess soon and an important part
>>>> of that is to stop registering multiple backlight interfaces for the
>>>> same panel/screen.
>>>>
>>>> Whereas going with this workaround requires us to have 2 active
>>>> backlight interfaces. Also this will very likely lead to
>>>> (subtly) different backlight behavior before and after the first
>>>> suspend/resume.
>>> I understand. Having multiple backlight devices for the same panel is indeed annoying. Out of curiosity, what is the plan for determining that multiple backlight interfaces are all supposed to control the same panel?
>> ATM the kernel basically only supports a bunch of different methods
>> to control the backlight of 1 internal panel. The plan is to tie this
>> to the panel from a userspace pov by making the brightness +
>> max_brightness properties on the drm_connector object for the
>> internal-panel.
>>
>> The in kernel tying of the backlight device to the internal panel
>> will be done hardcoded inside the drm driver(s) based on the
>> drivers already knowing which connector is the internal panel.
> 
> 
> Okay. At the moment the other problem I am thinking about also makes a one-internal-panel assumption, and it's true for the particular hardware that I'm working with, but I didn't like that assumption. If there isn't something existing that can be used to link connectors from multiple GPUs together to indicate they actually (potentially) drive the same display panel, then I suppose I'll continue assuming that. But I really do think the >1 display case needs to be more than just an afterthought, even if the solution, for now, is to just have all the drivers agree that a single internal panel is e.g. index 0 of some shared table that the drivers can all consult so that everybody knows which panel is which.
> 
> I am also concerned about muxed designs that don't have an EC-controlled backlight. If there's only allowed to be one backlight device per panel, then that means vga_switcheroo would have to do something to disable the outgoing GPU's backlight control and enable the incoming one's.

The one backlight device rule mainly applies to the case where the GPU driver's native backlight control is not used, so it needs to find another backlight-device (e.g. acpi_video or nvidia-wmi-ec-backlight) and proxy that to the properties on the panel's drm_connector.

If the native backlight control of the 2 GPUs is used then there is no such constraint, since the GPU driver in that case knows which backlight-device to use itself.

Currently acpi_video_get_backlight_type() has the following return values:

enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
};

The idea is to extend this to:

enum acpi_backlight_type {
        acpi_backlight_undef = -1,
        acpi_backlight_none = 0,
        acpi_backlight_video,
        acpi_backlight_vendor,
        acpi_backlight_native,
        acpi_backlight_nvidia_wmi_ec,
        acpi_backlight_apple_gmux,
};

And then have *all* (x86) backlight drivers do:

         if (acpi_video_get_backlight_type() != acpi_backlight_<my_type>)
                 return 0;

before registering the backlight-device. This is currently already
done by the acpi_backlight_video and acpi_backlight_vendor type
drivers, but not by the other ones.
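
The gating rule described above can be sketched in plain userspace C. This is only an illustration, not kernel code: mock_get_backlight_type(), detected_type and should_register() are hypothetical stand-ins, and the last two enum values are the proposed (not yet existing) additions.

```c
#include <assert.h>
#include <stdbool.h>

/* Extended enum as proposed above; the last two values are the proposal. */
enum acpi_backlight_type {
    acpi_backlight_undef = -1,
    acpi_backlight_none = 0,
    acpi_backlight_video,
    acpi_backlight_vendor,
    acpi_backlight_native,
    acpi_backlight_nvidia_wmi_ec,
    acpi_backlight_apple_gmux,
};

/* Hypothetical: the single system-wide answer that detection settled on. */
static enum acpi_backlight_type detected_type = acpi_backlight_nvidia_wmi_ec;

/* Hypothetical stand-in for acpi_video_get_backlight_type(). */
static enum acpi_backlight_type mock_get_backlight_type(void)
{
    return detected_type;
}

/*
 * Hypothetical helper: a backlight driver registers its device only when
 * the detected type matches the driver's own type.
 */
static bool should_register(enum acpi_backlight_type my_type)
{
    /* A driver must ask with a concrete type of its own. */
    assert(my_type != acpi_backlight_undef);

    return mock_get_backlight_type() == my_type;
}
```

With detected_type set to acpi_backlight_nvidia_wmi_ec, only the caller passing that type gets true; a native GPU driver asking with acpi_backlight_native would skip registration, which is the "one backlight device" outcome described above.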

In the case of there being 2 native GPU driver backlight-devices (with the active GPU being in actual control), both GPU drivers will do:

         if (acpi_video_get_backlight_type() != acpi_backlight_native)
                 return 0;

and since acpi_video_get_backlight_type() will return acpi_backlight_native in this case, the if condition will be false and both will continue with registering their (native) backlight device, each offering the new brightness properties on the drm_connector for *their* internal display connection.

One of those 2 internal-panel drm_connectors should always be in a disconnected state, so this way userspace can choose which one to use (the one which is actually connected).

I believe something similar is already done by userspace on devices like this. For native backlight-devices userspace can find the matching GPU by looking at the parent device and I remember at least discussing userspace checking which GPU is connected to the panel to choose which backlight device to use on systems with 2 native backlight devices.

> And then all the userspace software that controls the sysfs backlight interfaces would need to make sure to check to see what backlight devices are exposed every time a brightness change request is made, and not e.g. just once at init and then assume that the backlight devices won't change over the lifetime of whatever backlight manager software we're talking about (e.g. gnome-settings-daemon). I suppose if what the kernel exposes is an abstraction so that userspace only sees one backlight interface per panel in sysfs at any given time, which might actually be connected to one of several different drivers in the backlight subsystem, that would make it a little better, but I still think there would
> be potential for races between a mux switch and a brightness change event.

See above: in this case there will be 2 separate drm_connector-s, and on a dynamic switch userspace will see a display disconnect + reconnect. Userspace really needs to see this to be able to set up things like prime properly.

> 
>> This all naively assumes there is only 1 internal panel, which
>> for the majority of cases is true. My plan for devices with
>> 2 internal panels is to cross that bridge when we get there
>> (I expect those mostly in phone/tablet like devices for now
>> which will likely use devicetree where solving this is trivial).
>>
>> I do realize we will eventually get some x86/acpi device with
>> 2 internal panels. Hopefully we can just figure out what
>> the Windows drivers are doing there and parse e.g. the ACPI
>> info which Windows is using for this.
> 
> 
> I'm not aware of any more modern examples, but there already have been such systems. See e.g. Lenovo ThinkPad W701ds. IIRC that system was discrete GPU only, so there wouldn't be any concern about coordinating panel IDs between different GPU drivers, but there is a very real possibility that a vendor may want to bring back a similar design with hybrid GPUs. The asymmetry of that system always bothered me a bit: I imagine that if the NVIDIA GPUs of that era supported more than two heads per GPU, we may have seen a three-screen notebook.
> 
> It's very possible that some of the modern exotic e.g. folding designs show up as dual displays, but I don't have any first-hand experience with those to know whether that is the case. I definitely remember the W701ds exposed the two displays as separate entities. (And the one that popped off to the side was rotated, which was fun.)
> 
> 
>> As part of the move to properties on the drm_connector object
>> the /sys/class/backlight interface will become deprecated,
>> but will be kept for backward compat and will eventually
>> be put behind a Kconfig option.
> 
> 
> If you have a high-level design for this written down somewhere, do you mind sharing it? I want to get an idea of what types of changes the proprietary nvidia-drm driver might need to participate in this.

I don't really have a high level design written down yet.

Basically the plan is to have some generic helper code to allow a GPU driver to proxy a backlight device (including its own native backlight) to a set of properties on the drm_connector object. As mentioned before the idea is to keep the backlight-device API as kernel-internal API for this since we need this for backward compat and we need some sort of internal API between e.g. nvidia-wmi-ec-backlight and the GPU drivers anyways.

It will be the GPU driver's responsibility to pick a backlight-device + connector to pair up. Although I likely will add a helper for that too, which will take the GPU driver's own native backlight-device (may be NULL) + a drm-connector as input and then, based on the acpi_video_get_backlight_type() return value, find a backlight-device to proxy (iow it may end up not using the native backlight-device even if one is passed in).
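
As a rough userspace sketch of how such a helper could choose the device to proxy: everything below is hypothetical (no such helper exists yet), the structs are minimal stand-ins rather than the real kernel types, and the extended enum values are the proposed ones.

```c
#include <assert.h>
#include <stddef.h>

enum acpi_backlight_type {
    acpi_backlight_undef = -1,
    acpi_backlight_none = 0,
    acpi_backlight_video,
    acpi_backlight_vendor,
    acpi_backlight_native,
    acpi_backlight_nvidia_wmi_ec,   /* proposed */
    acpi_backlight_apple_gmux,      /* proposed */
};

/* Hypothetical, minimal stand-in for the kernel's backlight_device. */
struct backlight_device {
    const char *name;
    enum acpi_backlight_type type;
};

/*
 * Hypothetical helper: given the detected backlight type, the GPU driver's
 * own native device (may be NULL) and the list of registered devices,
 * return the device the connector should proxy, or NULL if there is none.
 */
static const struct backlight_device *
pick_backlight(enum acpi_backlight_type detected,
               const struct backlight_device *native,
               const struct backlight_device *const *registered, size_t n)
{
    assert(registered || n == 0);

    /* Native control: use whatever the GPU driver passed in (may be NULL). */
    if (detected == acpi_backlight_native)
        return native;

    /* Otherwise ignore the native device and look for a matching one. */
    for (size_t i = 0; i < n; i++)
        if (registered[i] && registered[i]->type == detected)
            return registered[i];

    return NULL;
}
```

In this sketch a GPU driver that passes in its native device still ends up proxying the EC device when the detected type is acpi_backlight_nvidia_wmi_ec, which is the "may end up not using the native backlight-device" case.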

I expect the helpers for this to also be suitable for the proprietary nvidia driver, but I guess that also depends on how much of the existing drm-helpers you already use.

>> The kernel internal backlight_device stuff will be kept
>> since we need some internal representation anyways and
>> I don't see much value in reworking that, esp. since
>> we need to have /sys/class/backlight backward compat.
>>
>> Note this is all based on discussions which I had with
>> mainly Daniel Vetter @plumbers 2019 in Lisbon. I have
>> never gotten around to actually start working on this,
>> but this has resurfaced recently and I plan to actually
>> take a stab at implementing this plan sometime during 2022.
>>
>>> I’m not sure I’m aware of any driver-agnostic ways of identifying a particular panel instance uniquely, and if there is one, that would actually help with something else I’m working on at the moment. If the idea involves e.g. the EDID, that could be troublesome for backlight drivers such as this one which aren’t associated with any GPU.
>> Right, see above the main idea is to make this
>> "the kernel's problem" and I expect us to fix this in
>> the kernel in a variety of different ways depending on
>> the actual hardware.
>>
>> As for "troublesome for backlight drivers such as this one
>> which aren’t associated with any GPU.", the idea is that:
>>
>> 1. E.g the i915 driver (which I have the most experience with)
>> knows which connector is the internal panel
> 
> 
> Sure, the NVIDIA proprietary driver knows this as well, and what I've seen from the other DRM drivers also suggests that this is an easy determination to make.
> 
> 
>> 2. The acpi_video_get_backlight_type() helper from
>> drivers/acpi/video_detect.c will get extended to make sure
>> that there is always only *1* /sys/class/backlight device.
>>
>> To be specific atm code supporting old vendor specific backlight
>> fw interfaces, e.g. drivers/platform/x86/dell-laptop.c:
>> already does:
>>
>>         if (acpi_video_get_backlight_type() != acpi_backlight_vendor)
>>                  return 0;
>>
>> And drivers/acpi/acpi_video.c also already does:
>>
>>         if (acpi_video_get_backlight_type() != acpi_backlight_video)
>>                  return 0;
>>
>> Currently looking at the 3 main x86 backlight interfaces: vendor,
>> generic-ACPI and native-drm-driver, only the native driver's
>> backlight registers unconditionally. The plan is to make those also
>> do a similar check (*) and to also add special backlight drivers like
>> nvidia-wmi-ec-backlight and drivers/video/backlight/apple_bl.c
>> to this mechanism.
> 
> 
> nvidia-wmi-ec-backlight does this check during probe:
> 
>>         ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>                                    WMI_BRIGHTNESS_MODE_GET, &source);
>>         if (ret)
>>                 return ret;
>>
>>         /*
>>          * This driver is only to be used when brightness control is handled
>>          * by the EC; otherwise, the GPU driver(s) should control brightness.
>>          */
>>         if (source != WMI_BRIGHTNESS_SOURCE_EC)
>>                 return -ENODEV;

Right, we will need to do something similar inside drivers/acpi/video_detect.c
in the future to make acpi_video_get_backlight_type() return 
acpi_backlight_nvidia_wmi_ec on systems which use this fw interface.
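
A minimal sketch of what that detection step might look like, written as standalone C rather than actual video_detect.c code. The struct, the function name and the numeric values of the source enum are assumptions for illustration only; the extended backlight type is the proposed one.

```c
#include <assert.h>
#include <stdbool.h>

enum acpi_backlight_type {
    acpi_backlight_undef = -1,
    acpi_backlight_none = 0,
    acpi_backlight_video,
    acpi_backlight_vendor,
    acpi_backlight_native,
    acpi_backlight_nvidia_wmi_ec,   /* proposed */
    acpi_backlight_apple_gmux,      /* proposed */
};

/* Names follow the driver; the numeric values here are placeholders. */
enum wmi_brightness_source {
    WMI_BRIGHTNESS_SOURCE_GPU = 1,
    WMI_BRIGHTNESS_SOURCE_EC = 2,
};

/* Hypothetical summary of what a WMI "brightness source" query returned. */
struct wmi_probe_result {
    bool guid_present;                  /* firmware exposes the WMI GUID */
    int query_err;                      /* 0 on success */
    enum wmi_brightness_source source;  /* valid only if query_err == 0 */
};

/*
 * Hypothetical detection step: report the EC type only when the firmware
 * interface exists, the query succeeded and it says the EC is in control.
 * Otherwise keep whatever the existing detection logic decided.
 */
static enum acpi_backlight_type
detect_backlight_type(const struct wmi_probe_result *wmi,
                      enum acpi_backlight_type fallback)
{
    assert(wmi);

    if (wmi->guid_present && wmi->query_err == 0 &&
        wmi->source == WMI_BRIGHTNESS_SOURCE_EC)
        return acpi_backlight_nvidia_wmi_ec;

    return fallback;
}
```

Note that this only trusts what the firmware reports, so on a system where the firmware claims EC control while the GPU is actually driving the backlight, a check like this would inherit the same wrong answer.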

> 
> 
> And part of the problem with this bug is that what the firmware is reporting here is a lie, at least until the first suspend/resume cycle.
> 
> Also, unlike the DRM drivers, the NVIDIA proprietary driver does not register its backlight handler unconditionally: one of the strange things about this system is that the NVIDIA proprietary driver *does* register a backlight handler, even though it's definitely not supposed to in this case. I did notice that the DRM drivers don't currently seem to have any checks to see whether they should register, and indeed on the EC backlight systems I have tested, with both amdgpu and i915, there is a non-functional sysfs backlight interface exposed by the iGPU. One of the other strange things about the system is the fact that the amdgpu backlight driver works at all.


Right, note that if making the amdgpu driver not register its backlight handler helps,
then we might add acpi_backlight_nvidia_wmi_ec support to acpi_video_get_backlight_type()
sooner (now) and add a:

         if (acpi_video_get_backlight_type() != acpi_backlight_native)
                 return 0;

to amdgpu now, to fix this.

>> 3. 1 + 2 means that the drm_driver can just tie the single
>> backlight_device which will be registered on the system to
>> the internal panel.
> 
> This sounds kind of ugly to me. If the backlight control is GPU-agnostic, as is the case here, then associating the backlight_device with a drm_driver doesn't seem right. It sounds like the plan is to move from the current sysfs interface to one that's exposed by the DRM subsystem. If that's the case, I think there should be a way to do so without tying it to a specific GPU driver. e.g., on a muxed system, one or the other GPU might be scanning out at any given time (which is why many of these systems have non-GPU backlight control), so if the interface is GPU-specific, that would be at the very least confusing.

Most muxed systems that I know don't support dynamic switching, so we would tie e.g.
the nvidia-wmi-ec-backlight device to the internal-panel drm_connector of the GPU
which actually is connected to the panel.

For dynamic switching the drm_connector_s_ for the internal panel would both
proxy the nvidia-wmi-ec-backlight device and userspace will use the one for
the connector which is actually connected at a given time.
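
To illustrate the dynamic-switching case, here is a small userspace sketch in which two connectors share one backlight device and the caller picks whichever connector is currently connected. The structs and both functions are hypothetical stand-ins, not the real DRM or backlight types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the kernel objects. */
struct backlight_device {
    const char *name;
    int brightness;
    int max_brightness;
};

struct drm_connector {
    const char *name;
    bool connected;                      /* is this GPU driving the panel? */
    struct backlight_device *backlight;  /* device proxied by this connector */
};

/* Return the internal-panel connector that is connected right now. */
static struct drm_connector *
active_panel_connector(struct drm_connector *conns, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (conns[i].connected && conns[i].backlight)
            return &conns[i];

    return NULL;
}

/* Model a mux switch: one connector disconnects, the other connects. */
static void mux_switch(struct drm_connector *from, struct drm_connector *to)
{
    assert(from && to);

    from->connected = false;
    to->connected = true;
}
```

Both connectors point at the same EC backlight device, so the brightness state follows the panel across a switch; only the connector through which userspace reaches it changes.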

>> Again I'm completely ignoring dual-internal-panel devices here
>> for simplicity's sake.
>>
>> Note this is getting a bit off-topic, but if you have insights
>> in this, or already can think of ways how this is not going to
>> work :)  please let me know.
> 
> 
> My concerns listed above are everything I can think of off the top of my head. I'm sure if I think about it more I'll come up with other concerns. This did get a bit off-topic, but I'm actually very glad you've told me all this.

Yeah it is good to discuss this now. It will likely be a couple
of months before I can actually post an RFC series implementing
my ideas for this.

Regards,

Hans


> 
> 
>>
>> *) And adding that check + the presence of nvidia-wmi-ec-backlight
>> support will make the native drm-driver not register its
>> backlight_device at all at which point the backlight-proxy workaround
>> from this patch breaks.
>>
>>
>>> This also gives me an idea for another experiment I didn’t think to try earlier. Alex: what happens if you hack amdgpu_atombios_encoder_init_backlight() in the amdgpu driver to just return right away? I wonder if the AMD GPU’s attempt to take over backlight control is what makes the firmware give control to the GPU rather than the EC initially.
>>>
>>> Regardless of the backlight proxy workaround, does the force refresh one seem reasonable? That one at least addresses a condition that happens at every suspend/resume cycle.
>> Good question, I must admit I stopped reading the patch after seeing
>> the proxy thing.
>>
>> I see that you are using a pm_notifier for this. I wonder if you
>> can try (on your own system) to add a pm_ops struct and make
>> wmi_driver.driver.pm point to that and check if that gets called
>> by adding e.g. a pr_info (I don't see why it would not get called).
>>
>> And assuming that works, using that would be a bit cleaner IMHO.
>> Although that does have resume-ordering implications. But I would
>> expect the EC to basically be always ready to get talked to at
>> the point in the resume cycle where normal (non early) resume
>> handlers are called.
>>
>> To be clear the idea would be to always have the suspend handler
>> (so that the driver and pm_ops structs can be const) and to check
>> a quirk flag inside the resume handler. Or maybe even just always
>> read back the brightness from the hw and check if it has changed?
>> Does this need to be behind a quirk ?
> 
> 
> Okay, thanks. I missed the wmi_driver.driver.pm when I was looking to wire up something to refresh the backlight level on resume; that does sound cleaner than registering a notifier. Alex did report that the backlight was briefly at maximum brightness following resume, before the notifier kicked in and reset it to the correct level, so hopefully hooking it up through pm_ops will be able to catch it early enough to avoid that. And making the refresh unconditional rather than hidden behind a quirk sounds reasonable too; I expect it to be harmless on systems that don't need it.
> 
> 
>>>> Is there no other way to solve this issue? Maybe we need to poke
>>>> vgaswitcheroo to set the current GPU mode even though this is
>>>> already reported as active to get things to switch to the ECs
>>>> control right away ?
>>> There isn’t a vgaswitcheroo handler for this particular mux device (yet), but there are separate ACPI interfaces for the mux itself. Poking the mux *shouldn’t* have any effect on what device is controlling the backlight for the panel, since when the system is in dynamic mux mode the EC is always supposed to be in control, but this system is already showing some weird behavior, so it doesn’t hurt to try.
>> Right, as you said the EC is always supposed to be in control, but
>> it is not. I would not be surprised if making the ACPI call to put
>> things in dynamic mode (even though they already are) fixes this,
>> assuming there is such an ACPI call...
> 
> 
> Yes, there is one, although changes to the mode aren't supposed to take effect until reboot. However, this system already seems to be misbehaving, so it's certainly possible that poking at that call will do something.
> 
> 
>>>> I'm pretty certain that Windows is not doing this backlight proxying,
>>>> IMHO we need to figure out what causes the switch after suspend/resume
>>>> and then do that thing at boot.
>>> I’ll put together an instrumented driver for Alex to try on his system, to capture some more data and see if I can get some more insight into that. I also have a dump of his ACPI tables, and can check there for some other potential leads. Hopefully whatever changes the state across the suspend/resume cycle is a response to something that Linux does or doesn’t do, and not some state that is only handled internally within the firmware.
>> Great, thank you.
>>
>> Regards,
>>
>> Hans
>>
>>
>>
>>>>> If a system with the "backlight level reset to full on resume" quirk
>>>>> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
>>>>> reset the backlight to the previous level upon resume.
>>>>>
>>>>> These workarounds are also plumbed through to kernel module parameters,
>>>>> to make it easier for users who suspect they may be affected by one or
>>>>> both of these bugs to test whether these workarounds are effective on
>>>>> their systems as well.
>>>>>
>>>>> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
>>>>> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
>>>>> ---
>>>>> Note: the Tested-by: line above applies to the previous version of this
>>>>> patch; an explicit ACK from the tester is required for it to apply to
>>>>> the current version.
>>>>>
>>>>> v2:
>>>>> * Add readable sysfs files for module params, use linear interpolation
>>>>>    from fixp-arith.h, fix return value of notifier callback, use devm_*()
>>>>>    for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
>>>>> * Add comment to denote known firmware versions that exhibit the bugs.
>>>>>    (Mario Limonciello <Mario.Limonciello@amd.com>)
>>>>> * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
>>>>>
>>>>> .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
>>>>> 1 file changed, 194 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> index 61e37194df70..95e1ddf780fc 100644
>>>>> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
>>>>> @@ -3,8 +3,12 @@
>>>>>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>>>>>   */
>>>>>
>>>>> +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
>>>>> +
>>>>> #include <linux/acpi.h>
>>>>> #include <linux/backlight.h>
>>>>> +#include <linux/dmi.h>
>>>>> +#include <linux/fixp-arith.h>
>>>>> #include <linux/mod_devicetable.h>
>>>>> #include <linux/module.h>
>>>>> #include <linux/types.h>
>>>>> @@ -75,6 +79,73 @@ struct wmi_brightness_args {
>>>>>     u32 ignored[3];
>>>>> };
>>>>>
>>>>> +/**
>>>>> + * struct nvidia_wmi_ec_backlight_priv - driver private data
>>>>> + * @bl_dev:       the associated backlight device
>>>>> + * @proxy_target: backlight device which receives relayed brightness changes
>>>>> + * @nb:           notifier block for resume callback
>>>>> + */
>>>>> +struct nvidia_wmi_ec_backlight_priv {
>>>>> +    struct backlight_device *bl_dev;
>>>>> +    struct backlight_device *proxy_target;
>>>>> +    struct notifier_block nb;
>>>>> +};
>>>>> +
>>>>> +static char *backlight_proxy_target;
>>>>> +module_param(backlight_proxy_target, charp, 0444);
>>>>> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
>>>>> +
>>>>> +static int max_reprobe_attempts = 128;
>>>>> +module_param(max_reprobe_attempts, int, 0444);
>>>>> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
>>>>> +
>>>>> +static bool restore_level_on_resume;
>>>>> +module_param(restore_level_on_resume, bool, 0444);
>>>>> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
>>>>> +
>>>>> +/* Bit field values for quirks table */
>>>>> +
>>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
>>>>> +
>>>>> +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
>>>>> +
>>>>> +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
>>>>> +
>>>>> +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
>>>>> +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
>>>>> +
>>>>> +static int assign_quirks(const struct dmi_system_id *id)
>>>>> +{
>>>>> +    if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
>>>>> +        restore_level_on_resume = true;
>>>>> +
>>>>> +    /* If the module parameter is set, override the quirks table */
>>>>> +    if (!backlight_proxy_target) {
>>>>> +        if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
>>>>> +            backlight_proxy_target = "amdgpu_bl1";
>>>>> +    }
>>>>> +
>>>>> +    return 1;
>>>>> +}
>>>>> +
>>>>> +#define QUIRK_ENTRY(vendor, product, quirks) {          \
>>>>> +    .callback = assign_quirks,                      \
>>>>> +    .matches = {                                    \
>>>>> +        DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
>>>>> +        DMI_MATCH(DMI_PRODUCT_VERSION, product) \
>>>>> +    },                                              \
>>>>> +    .driver_data = (void *)(quirks)                 \
>>>>> +}
>>>>> +
>>>>> +static const struct dmi_system_id quirks_table[] = {
>>>>> +    QUIRK_ENTRY(
>>>>> +        /* This quirk is preset as of firmware revision HACN31WW */
>>>>> +        "LENOVO", "Legion S7 15ACH6",
>>>>> +        QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
>>>>> +    ),
>>>>> +    { }
>>>>> +};
>>>>> +
>>>>> /**
>>>>>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>>>>>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
>>>>> @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>>>>>     return 0;
>>>>> }
>>>>>
>>>>> +/* Scale the current brightness level of 'from' to the range of 'to'. */
>>>>> +static int scale_backlight_level(const struct backlight_device *from,
>>>>> +                 const struct backlight_device *to)
>>>>> +{
>>>>> +    int from_max = from->props.max_brightness;
>>>>> +    int from_level = from->props.brightness;
>>>>> +    int to_max = to->props.max_brightness;
>>>>> +
>>>>> +    return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
>>>>> +}
>>>>> +
>>>>> static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>>>>> {
>>>>>     struct wmi_device *wdev = bl_get_data(bd);
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>>> +    struct backlight_device *proxy_target = priv->proxy_target;
>>>>> +
>>>>> +    if (proxy_target) {
>>>>> +        int level = scale_backlight_level(bd, proxy_target);
>>>>> +
>>>>> +        if (backlight_device_set_brightness(proxy_target, level))
>>>>> +            pr_warn("Failed to relay backlight update to \"%s\"",
>>>>> +                backlight_proxy_target);
>>>>> +    }
>>>>>
>>>>>     return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>>>>>                                  WMI_BRIGHTNESS_MODE_SET,
>>>>> @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>>>>>     .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>>>>> };
>>>>>
>>>>> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
>>>>> +{
>>>>> +
>>>>> +    /*
>>>>> +     * On some systems, the EC backlight level gets reset to 100% when
>>>>> +     * resuming from suspend, but the backlight device state still reflects
>>>>> +     * the pre-suspend value. Refresh the existing state to sync the EC's
>>>>> +     * state back up with the kernel's.
>>>>> +     */
>>>>> +    if (event == PM_POST_SUSPEND) {
>>>>> +        struct nvidia_wmi_ec_backlight_priv *p;
>>>>> +        int ret;
>>>>> +
>>>>> +        p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
>>>>> +        ret = backlight_update_status(p->bl_dev);
>>>>> +
>>>>> +        if (ret)
>>>>> +            pr_warn("failed to refresh backlight level: %d", ret);
>>>>> +
>>>>> +        return NOTIFY_OK;
>>>>> +    }
>>>>> +
>>>>> +    return NOTIFY_DONE;
>>>>> +}
>>>>> +
>>>>> +static void putdev(void *data)
>>>>> +{
>>>>> +    struct device *dev = data;
>>>>> +
>>>>> +    put_device(dev);
>>>>> +}
>>>>> +
>>>>> static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>>>>> {
>>>>> +    struct backlight_device *bdev, *target = NULL;
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv;
>>>>>     struct backlight_properties props = {};
>>>>> -    struct backlight_device *bdev;
>>>>>     u32 source;
>>>>>     int ret;
>>>>>
>>>>> +    /*
>>>>> +     * Check quirks tables to see if this system needs any of the firmware
>>>>> +     * bug workarounds.
>>>>> +     */
>>>>> +    dmi_check_system(quirks_table);
>>>>> +
>>>>> +    if (backlight_proxy_target && backlight_proxy_target[0]) {
>>>>> +        static int num_reprobe_attempts;
>>>>> +
>>>>> +        target = backlight_device_get_by_name(backlight_proxy_target);
>>>>> +
>>>>> +        if (target) {
>>>>> +            ret = devm_add_action_or_reset(&wdev->dev, putdev,
>>>>> +                               &target->dev);
>>>>> +            if (ret)
>>>>> +                return ret;
>>>>> +        } else {
>>>>> +            /*
>>>>> +             * The target backlight device might not be ready;
>>>>> +             * try again and disable backlight proxying if it
>>>>> +             * fails too many times.
>>>>> +             */
>>>>> +            if (num_reprobe_attempts < max_reprobe_attempts) {
>>>>> +                num_reprobe_attempts++;
>>>>> +                return -EPROBE_DEFER;
>>>>> +            }
>>>>> +
>>>>> +            pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
>>>>> +                backlight_proxy_target, max_reprobe_attempts);
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>>     ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>>>>>                                WMI_BRIGHTNESS_MODE_GET, &source);
>>>>>     if (ret)
>>>>> @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>>>>>                           &wdev->dev, wdev,
>>>>>                           &nvidia_wmi_ec_backlight_ops,
>>>>>                           &props);
>>>>> -    return PTR_ERR_OR_ZERO(bdev);
>>>>> +
>>>>> +    if (IS_ERR(bdev))
>>>>> +        return PTR_ERR(bdev);
>>>>> +
>>>>> +    priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
>>>>> +    if (!priv)
>>>>> +        return -ENOMEM;
>>>>> +
>>>>> +    priv->bl_dev = bdev;
>>>>> +
>>>>> +    dev_set_drvdata(&wdev->dev, priv);
>>>>> +
>>>>> +    if (target) {
>>>>> +        int level = scale_backlight_level(target, bdev);
>>>>> +
>>>>> +        if (backlight_device_set_brightness(bdev, level))
>>>>> +            pr_warn("Unable to import initial brightness level from %s.",
>>>>> +                backlight_proxy_target);
>>>>> +        priv->proxy_target = target;
>>>>> +    }
>>>>> +
>>>>> +    if (restore_level_on_resume) {
>>>>> +        priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
>>>>> +        register_pm_notifier(&priv->nb);
>>>>> +    }
>>>>> +
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
>>>>> +{
>>>>> +    struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
>>>>> +
>>>>> +    if (priv->nb.notifier_call)
>>>>> +        unregister_pm_notifier(&priv->nb);
>>>>> }
>>>>>
>>>>> #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
>>>>> @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>>>>>         .name = "nvidia-wmi-ec-backlight",
>>>>>     },
>>>>>     .probe = nvidia_wmi_ec_backlight_probe,
>>>>> +    .remove = nvidia_wmi_ec_backlight_remove,
>>>>>     .id_table = nvidia_wmi_ec_backlight_id_table,
>>>>> };
>>>>> module_wmi_driver(nvidia_wmi_ec_backlight_driver);
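For readers who want to check the arithmetic behind scale_backlight_level() in the quoted patch, here is a minimal userspace sketch. It assumes the kernel helper behaves as a plain integer linear interpolation that truncates the result; interpolate() and scale_level() are hypothetical stand-in names for illustration, not kernel API.

```c
/*
 * Userspace sketch only: interpolate() is a stand-in for the kernel's
 * fixp_linear_interpolate(), assumed here to use truncating integer math.
 */
static int interpolate(int x0, int y0, int x1, int y1, int x)
{
	/* Degenerate ranges and the lower endpoint map straight to y0. */
	if (y0 == y1 || x == x0)
		return y0;
	/* A zero-width input range or the upper endpoint maps to y1. */
	if (x1 == x0 || x == x1)
		return y1;

	return y0 + ((y1 - y0) * (x - x0) / (x1 - x0));
}

/*
 * Map a level in [0, from_max] onto [0, to_max], mirroring the call
 * fixp_linear_interpolate(0, 0, from_max, to_max, from_level) in the patch.
 * The intermediate product can overflow for very large ranges; backlight
 * ranges are assumed to be small enough for that not to matter.
 */
static int scale_level(int from_level, int from_max, int to_max)
{
	return interpolate(0, 0, from_max, to_max, from_level);
}
```

Under these assumptions, a level of 50 on a 0-100 device maps to 127 on a 0-255 device, and both endpoints map exactly onto each other. Because the division truncates, scaling a value to another range and back again may lose a step.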
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2022-03-16 22:09         ` Alexandru Dinu
  2022-03-16 22:14           ` Alexandru Dinu
@ 2023-01-30 22:00           ` Daniel Dadap
  2023-01-31 19:56             ` Alexandru Dinu
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel Dadap @ 2023-01-30 22:00 UTC (permalink / raw)
  To: Alexandru Dinu
  Cc: platform-driver-x86, Barnabás Pőcze, Hans de Goede,
	markgross, Limonciello, Mario, Deucher, Alexander

Hi Alex,

On Thu, Mar 17, 2022 at 12:09:03AM +0200, Alexandru Dinu wrote:
> > Note: the Tested-by: line above applies to the previous version of this
> > patch; an explicit ACK from the tester is required for it to apply to
> > the current version.
> 
> I compiled and tested v2 on 5.16.14.
> Everything works as expected: brightness control & level restore work
> both on first boot and on subsequent sleep/resume cycles.

I ended up abandoning this workaround patch because it was incompatible
with Hans's plan to clean up the backlight subsystem. In the meantime,
somebody else reported a similar issue recently which appears to be
resolved by updating to the latest firmware version. Have you updated to
the most recent firmware, and if so, are you still seeing this issue?

> Regards,
> Alex
> 
> 
> 
> On Wed, 16 Mar 2022 at 23:28, Daniel Dadap <ddadap@nvidia.com> wrote:
> >
> > Sorry, just noticed a typo in a comment:
> >
> > /* This quirk is preset as of firmware revision HACN31WW */
> >
> > Obviously that is meant to read "present". I'll fix that with the next
> > round of changes, assuming there will be additional review feedback.
> >
> > On 3/16/22 15:33, Daniel Dadap wrote:
> > > Some notebook systems with EC-driven backlight control appear to have a
> > > firmware bug which causes the system to use GPU-driven backlight control
> > > upon a fresh boot, but then switches to EC-driven backlight control
> > > after completing a suspend/resume cycle. All the while, the firmware
> > > reports that the backlight is under EC control, regardless of what is
> > > actually controlling the backlight brightness.
> > >
> > > This leads to the following behavior:
> > >
> > > * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
> > >    WMI-wrapped ACPI method erroneously reporting EC control.
> > > * nvidia-wmi-ec-backlight does not work until after a suspend/resume
> > >    cycle, due to the backlight control actually being GPU-driven.
> > > * GPU drivers also register their own backlight handlers: in the case
> > >    of the notebook system where this behavior has been observed, both
> > >    amdgpu and the NVIDIA proprietary driver register backlight handlers.
> > > * The GPU which has backlight control upon a fresh boot (amdgpu in the
> > >    case observed so far) can successfully control the backlight through
> > >    its backlight driver's sysfs interface, but stops working after the
> > >    first suspend/resume cycle.
> > > * nvidia-wmi-ec-backlight is unable to control the backlight upon a
> > >    fresh boot, but begins to work after the first suspend/resume cycle.
> > > * The GPU which does not have backlight control (NVIDIA in this case)
> > >    is not able to control the backlight at any point while the system
> > >    is in operation. On similar hybrid systems with an EC-controlled
> > >    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
> > >    does not register its backlight handler. It has not been determined
> > >    whether the non-functional handler registered by the NVIDIA driver
> > >    is due to another firmware bug, or a bug in the NVIDIA driver.
> > >
> > > Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> > > device, it takes precedence over the BACKLIGHT_RAW devices registered
> > > by the GPU drivers. This in turn leads to backlight control appearing
> > > to be non-functional until after completing a suspend/resume cycle.
> > > However, it is still possible to control the backlight through direct
> > > interaction with the working GPU driver's backlight sysfs interface.
> > >
> > > These systems also appear to have a second firmware bug which resets
> > > the EC's brightness level to 100% on resume, but leaves the state in
> > > the kernel at the pre-suspend level. This causes attempts to save
> > > and restore the backlight level across the suspend/resume cycle to
> > > fail, due to the level appearing not to change even though it did.
> > >
> > > In order to work around these issues, add a quirk table to detect
> > > systems that are known to show these behaviors. So far, there is
> > > only one known system that requires these workarounds, and both
> > > issues are present on that system, but the quirks are tracked
> > > separately to make it easier to add them to other systems which
> > > may exhibit one of the bugs, but not the other. The original systems
> > > that this driver was tested on during development do not exhibit
> > > either of these quirks.
> > >
> > > If a system with the "GPU driver has backlight control" quirk is
> > > detected, nvidia-wmi-ec-backlight will grab a reference to the working
> > > (when freshly booted) GPU backlight handler and relays any backlight
> > > brightness level change requests directed at the EC to also be applied
> > > to the GPU backlight interface. This leads to redundant updates
> > > directed at the GPU backlight driver after a suspend/resume cycle, but
> > > it does allow the EC backlight control to work when the system is
> > > freshly booted.
> > >
> > > If a system with the "backlight level reset to full on resume" quirk
> > > is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> > > reset the backlight to the previous level upon resume.
> > >
> > > These workarounds are also plumbed through to kernel module parameters,
> > > to make it easier for users who suspect they may be affected by one or
> > > both of these bugs to test whether these workarounds are effective on
> > > their systems as well.
> > >
> > > Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> > > Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> > > ---
> > > Note: the Tested-by: line above applies to the previous version of this
> > > patch; an explicit ACK from the tester is required for it to apply to
> > > the current version.
> > >
> > > v2:
> > >   * Add readable sysfs files for module params, use linear interpolation
> > >     from fixp-arith.h, fix return value of notifier callback, use devm_*()
> > >     for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
> > >   * Add comment to denote known firmware versions that exhibit the bugs.
> > >     (Mario Limonciello <Mario.Limonciello@amd.com>)
> > >   * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
> > >
> > >   .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
> > >   1 file changed, 194 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > index 61e37194df70..95e1ddf780fc 100644
> > > --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > @@ -3,8 +3,12 @@
> > >    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
> > >    */
> > >
> > > +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> > > +
> > >   #include <linux/acpi.h>
> > >   #include <linux/backlight.h>
> > > +#include <linux/dmi.h>
> > > +#include <linux/fixp-arith.h>
> > >   #include <linux/mod_devicetable.h>
> > >   #include <linux/module.h>
> > >   #include <linux/types.h>
> > > @@ -75,6 +79,73 @@ struct wmi_brightness_args {
> > >       u32 ignored[3];
> > >   };
> > >
> > > +/**
> > > + * struct nvidia_wmi_ec_backlight_priv - driver private data
> > > + * @bl_dev:       the associated backlight device
> > > + * @proxy_target: backlight device which receives relayed brightness changes
> > > > + * @nb:           notifier block for resume callback
> > > + */
> > > +struct nvidia_wmi_ec_backlight_priv {
> > > +     struct backlight_device *bl_dev;
> > > +     struct backlight_device *proxy_target;
> > > +     struct notifier_block nb;
> > > +};
> > > +
> > > +static char *backlight_proxy_target;
> > > +module_param(backlight_proxy_target, charp, 0444);
> > > +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> > > +
> > > +static int max_reprobe_attempts = 128;
> > > +module_param(max_reprobe_attempts, int, 0444);
> > > +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> > > +
> > > +static bool restore_level_on_resume;
> > > +module_param(restore_level_on_resume, bool, 0444);
> > > +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> > > +
> > > +/* Bit field values for quirks table */
> > > +
> > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> > > +
> > > +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> > > +
> > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> > > +
> > > +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> > > +#define HAS_QUIRK(data, name) (((long) data) & QUIRK(name))
> > > +
> > > +static int assign_quirks(const struct dmi_system_id *id)
> > > +{
> > > +     if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> > > > +             restore_level_on_resume = true;
> > > +
> > > +     /* If the module parameter is set, override the quirks table */
> > > +     if (!backlight_proxy_target) {
> > > +             if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> > > +                     backlight_proxy_target = "amdgpu_bl1";
> > > +     }
> > > +
> > > > +     return 1;
> > > +}
> > > +
> > > +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> > > +     .callback = assign_quirks,                      \
> > > +     .matches = {                                    \
> > > +             DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> > > +             DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> > > +     },                                              \
> > > +     .driver_data = (void *)(quirks)                 \
> > > +}
> > > +
> > > +static const struct dmi_system_id quirks_table[] = {
> > > +     QUIRK_ENTRY(
> > > +             /* This quirk is preset as of firmware revision HACN31WW */
> > > +             "LENOVO", "Legion S7 15ACH6",
> > > +             QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> > > +     ),
> > > +     { }
> > > +};
> > > +
> > >   /**
> > >    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
> > >    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> > > @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
> > >       return 0;
> > >   }
> > >
> > > +/* Scale the current brightness level of 'from' to the range of 'to'. */
> > > +static int scale_backlight_level(const struct backlight_device *from,
> > > +                              const struct backlight_device *to)
> > > +{
> > > +     int from_max = from->props.max_brightness;
> > > +     int from_level = from->props.brightness;
> > > +     int to_max = to->props.max_brightness;
> > > +
> > > +     return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> > > +}
> > > +
> > >   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
> > >   {
> > >       struct wmi_device *wdev = bl_get_data(bd);
> > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > +     struct backlight_device *proxy_target = priv->proxy_target;
> > > +
> > > +     if (proxy_target) {
> > > +             int level = scale_backlight_level(bd, proxy_target);
> > > +
> > > +             if (backlight_device_set_brightness(proxy_target, level))
> > > +                     pr_warn("Failed to relay backlight update to \"%s\"",
> > > +                             backlight_proxy_target);
> > > +     }
> > >
> > >       return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
> > >                                    WMI_BRIGHTNESS_MODE_SET,
> > > @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
> > >       .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
> > >   };
> > >
> > > +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> > > +{
> > > +
> > > +     /*
> > > +      * On some systems, the EC backlight level gets reset to 100% when
> > > +      * resuming from suspend, but the backlight device state still reflects
> > > +      * the pre-suspend value. Refresh the existing state to sync the EC's
> > > +      * state back up with the kernel's.
> > > +      */
> > > +     if (event == PM_POST_SUSPEND) {
> > > +             struct nvidia_wmi_ec_backlight_priv *p;
> > > +             int ret;
> > > +
> > > +             p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> > > +             ret = backlight_update_status(p->bl_dev);
> > > +
> > > +             if (ret)
> > > +                     pr_warn("failed to refresh backlight level: %d", ret);
> > > +
> > > +             return NOTIFY_OK;
> > > +     }
> > > +
> > > +     return NOTIFY_DONE;
> > > +}
> > > +
> > > +static void putdev(void *data)
> > > +{
> > > +     struct device *dev = data;
> > > +
> > > +     put_device(dev);
> > > +}
> > > +
> > >   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
> > >   {
> > > +     struct backlight_device *bdev, *target = NULL;
> > > +     struct nvidia_wmi_ec_backlight_priv *priv;
> > >       struct backlight_properties props = {};
> > > -     struct backlight_device *bdev;
> > >       u32 source;
> > >       int ret;
> > >
> > > +     /*
> > > +      * Check quirks tables to see if this system needs any of the firmware
> > > +      * bug workarounds.
> > > +      */
> > > +     dmi_check_system(quirks_table);
> > > +
> > > +     if (backlight_proxy_target && backlight_proxy_target[0]) {
> > > +             static int num_reprobe_attempts;
> > > +
> > > +             target = backlight_device_get_by_name(backlight_proxy_target);
> > > +
> > > +             if (target) {
> > > +                     ret = devm_add_action_or_reset(&wdev->dev, putdev,
> > > +                                                    &target->dev);
> > > +                     if (ret)
> > > +                             return ret;
> > > +             } else {
> > > +                     /*
> > > +                      * The target backlight device might not be ready;
> > > +                      * try again and disable backlight proxying if it
> > > +                      * fails too many times.
> > > +                      */
> > > +                     if (num_reprobe_attempts < max_reprobe_attempts) {
> > > +                             num_reprobe_attempts++;
> > > +                             return -EPROBE_DEFER;
> > > +                     }
> > > +
> > > +                     pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> > > +                             backlight_proxy_target, max_reprobe_attempts);
> > > +             }
> > > +     }
> > > +
> > >       ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
> > >                                  WMI_BRIGHTNESS_MODE_GET, &source);
> > >       if (ret)
> > > @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
> > >                                             &wdev->dev, wdev,
> > >                                             &nvidia_wmi_ec_backlight_ops,
> > >                                             &props);
> > > -     return PTR_ERR_OR_ZERO(bdev);
> > > +
> > > +     if (IS_ERR(bdev))
> > > +             return PTR_ERR(bdev);
> > > +
> > > +     priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> > > +     if (!priv)
> > > +             return -ENOMEM;
> > > +
> > > +     priv->bl_dev = bdev;
> > > +
> > > +     dev_set_drvdata(&wdev->dev, priv);
> > > +
> > > +     if (target) {
> > > +             int level = scale_backlight_level(target, bdev);
> > > +
> > > +             if (backlight_device_set_brightness(bdev, level))
> > > +                     pr_warn("Unable to import initial brightness level from %s.",
> > > +                             backlight_proxy_target);
> > > +             priv->proxy_target = target;
> > > +     }
> > > +
> > > +     if (restore_level_on_resume) {
> > > +             priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> > > +             register_pm_notifier(&priv->nb);
> > > +     }
> > > +
> > > +     return 0;
> > > +}
> > > +
> > > +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> > > +{
> > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > +
> > > +     if (priv->nb.notifier_call)
> > > +             unregister_pm_notifier(&priv->nb);
> > >   }
> > >
> > >   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> > > @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
> > >               .name = "nvidia-wmi-ec-backlight",
> > >       },
> > >       .probe = nvidia_wmi_ec_backlight_probe,
> > > +     .remove = nvidia_wmi_ec_backlight_remove,
> > >       .id_table = nvidia_wmi_ec_backlight_id_table,
> > >   };
> > >   module_wmi_driver(nvidia_wmi_ec_backlight_driver);
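The precedence between the quirk table and the module parameters in assign_quirks() above can be sketched in userspace roughly as follows. This is an illustration under assumptions, not the driver code: struct quirk_state and apply_quirks() are hypothetical names, BIT() is redefined locally, and the quirk macro names are shortened from the ones in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/* Same bit positions as the patch; names shortened for the sketch. */
#define QUIRK_RESTORE_LEVEL_ON_RESUME BIT(0)
#define QUIRK_PROXY_TO_AMDGPU_BL1     BIT(8)

/* Hypothetical container for what the patch keeps in module parameters. */
struct quirk_state {
	bool restore_level_on_resume;
	const char *proxy_target;	/* NULL means no proxying requested */
};

/* Fold one quirk-table bit field into the current state. */
static void apply_quirks(struct quirk_state *st, unsigned long quirks)
{
	/* The table can only turn this workaround on, never off. */
	if (quirks & QUIRK_RESTORE_LEVEL_ON_RESUME)
		st->restore_level_on_resume = true;

	/* A target already supplied by the user wins over the table entry. */
	if (!st->proxy_target && (quirks & QUIRK_PROXY_TO_AMDGPU_BL1))
		st->proxy_target = "amdgpu_bl1";
}
```

With both bits set and no user-supplied target, the state ends up with level restore enabled and "amdgpu_bl1" as the proxy target. If a target was already supplied, it is left untouched. Keeping the proxy targets in bits 8 and above, as the patch comment describes, leaves room for further boolean quirks below them.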

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2023-01-30 22:00           ` Daniel Dadap
@ 2023-01-31 19:56             ` Alexandru Dinu
  2023-02-07 23:23               ` Daniel Dadap
  0 siblings, 1 reply; 31+ messages in thread
From: Alexandru Dinu @ 2023-01-31 19:56 UTC (permalink / raw)
  To: Daniel Dadap
  Cc: platform-driver-x86, Barnabás Pőcze, Hans de Goede,
	markgross, Limonciello, Mario, Deucher, Alexander

Hello,

I updated from HACN31WW to the latest HACN39WW and tested on a fresh
6.1.7 kernel.
The brightness issue is not fixed -- same old behaviour: controls only
work after an initial resume from suspend.

So far, I've been using your patched version of the
nvidia-wmi-ec-backlight module which worked great -- manually patching
the module after each kernel update.

> somebody else reported a similar issue recently which appears to be resolved by updating to the latest firmware version.

Can you please point me to this reference?

Thank you!


On Tue, 31 Jan 2023 at 00:00, Daniel Dadap <ddadap@nvidia.com> wrote:
>
> Hi Alex,
>
> On Thu, Mar 17, 2022 at 12:09:03AM +0200, Alexandru Dinu wrote:
> > > Note: the Tested-by: line above applies to the previous version of this
> > > patch; an explicit ACK from the tester is required for it to apply to
> > > the current version.
> >
> > I compiled and tested v2 on 5.16.14.
> > Everything works as expected: brightness control & level restore work
> > both on first boot and on subsequent sleep/resume cycles.
>
> I ended up abandoning this workaround patch because it was incompatible
> with Hans's plan to clean up the backlight subsystem. In the meantime,
> somebody else reported a similar issue recently which appears to be
> resolved by updating to the latest firmware version. Have you updated to
> the most recent firmware, and if so, are you still seeing this issue?
>
> > Regards,
> > Alex
> >
> >
> >
> > On Wed, 16 Mar 2022 at 23:28, Daniel Dadap <ddadap@nvidia.com> wrote:
> > >
> > > Sorry, just noticed a typo in a comment:
> > >
> > > /* This quirk is preset as of firmware revision HACN31WW */
> > >
> > > Obviously that is meant to read "present". I'll fix that with the next
> > > round of changes, assuming there will be additional review feedback.
> > >
> > > On 3/16/22 15:33, Daniel Dadap wrote:
> > > > Some notebook systems with EC-driven backlight control appear to have a
> > > > firmware bug which causes the system to use GPU-driven backlight control
> > > > upon a fresh boot, but then switches to EC-driven backlight control
> > > > after completing a suspend/resume cycle. All the while, the firmware
> > > > reports that the backlight is under EC control, regardless of what is
> > > > actually controlling the backlight brightness.
> > > >
> > > > This leads to the following behavior:
> > > >
> > > > * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
> > > >    WMI-wrapped ACPI method erroneously reporting EC control.
> > > > * nvidia-wmi-ec-backlight does not work until after a suspend/resume
> > > >    cycle, due to the backlight control actually being GPU-driven.
> > > > * GPU drivers also register their own backlight handlers: in the case
> > > >    of the notebook system where this behavior has been observed, both
> > > >    amdgpu and the NVIDIA proprietary driver register backlight handlers.
> > > > * The GPU which has backlight control upon a fresh boot (amdgpu in the
> > > >    case observed so far) can successfully control the backlight through
> > > >    its backlight driver's sysfs interface, but stops working after the
> > > >    first suspend/resume cycle.
> > > > * nvidia-wmi-ec-backlight is unable to control the backlight upon a
> > > >    fresh boot, but begins to work after the first suspend/resume cycle.
> > > > * The GPU which does not have backlight control (NVIDIA in this case)
> > > >    is not able to control the backlight at any point while the system
> > > >    is in operation. On similar hybrid systems with an EC-controlled
> > > >    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
> > > >    does not register its backlight handler. It has not been determined
> > > >    whether the non-functional handler registered by the NVIDIA driver
> > > >    is due to another firmware bug, or a bug in the NVIDIA driver.
> > > >
> > > > Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> > > > device, it takes precedence over the BACKLIGHT_RAW devices registered
> > > > by the GPU drivers. This in turn leads to backlight control appearing
> > > > to be non-functional until after completing a suspend/resume cycle.
> > > > However, it is still possible to control the backlight through direct
> > > > interaction with the working GPU driver's backlight sysfs interface.
> > > >
> > > > These systems also appear to have a second firmware bug which resets
> > > > the EC's brightness level to 100% on resume, but leaves the state in
> > > > the kernel at the pre-suspend level. This causes attempts to save
> > > > and restore the backlight level across the suspend/resume cycle to
> > > > fail, due to the level appearing not to change even though it did.
> > > >
> > > > In order to work around these issues, add a quirk table to detect
> > > > systems that are known to show these behaviors. So far, there is
> > > > only one known system that requires these workarounds, and both
> > > > issues are present on that system, but the quirks are tracked
> > > > separately to make it easier to add them to other systems which
> > > > may exhibit one of the bugs, but not the other. The original systems
> > > > that this driver was tested on during development do not exhibit
> > > > either of these quirks.
> > > >
> > > > If a system with the "GPU driver has backlight control" quirk is
> > > > detected, nvidia-wmi-ec-backlight will grab a reference to the working
> > > > > (when freshly booted) GPU backlight handler and relay any backlight
> > > > brightness level change requests directed at the EC to also be applied
> > > > to the GPU backlight interface. This leads to redundant updates
> > > > directed at the GPU backlight driver after a suspend/resume cycle, but
> > > > it does allow the EC backlight control to work when the system is
> > > > freshly booted.
> > > >
> > > > If a system with the "backlight level reset to full on resume" quirk
> > > > is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> > > > reset the backlight to the previous level upon resume.
> > > >
> > > > These workarounds are also plumbed through to kernel module parameters,
> > > > to make it easier for users who suspect they may be affected by one or
> > > > both of these bugs to test whether these workarounds are effective on
> > > > their systems as well.
> > > >
> > > > Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> > > > Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> > > > ---
> > > > Note: the Tested-by: line above applies to the previous version of this
> > > > patch; an explicit ACK from the tester is required for it to apply to
> > > > the current version.
> > > >
> > > > v2:
> > > >   * Add readable sysfs files for module params, use linear interpolation
> > > >     from fixp-arith.h, fix return value of notifier callback, use devm_*()
> > > >     for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
> > > >   * Add comment to denote known firmware versions that exhibit the bugs.
> > > >     (Mario Limonciello <Mario.Limonciello@amd.com>)
> > > >   * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
> > > >
> > > >   .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
> > > >   1 file changed, 194 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > index 61e37194df70..95e1ddf780fc 100644
> > > > --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > @@ -3,8 +3,12 @@
> > > >    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
> > > >    */
> > > >
> > > > +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> > > > +
> > > >   #include <linux/acpi.h>
> > > >   #include <linux/backlight.h>
> > > > +#include <linux/dmi.h>
> > > > +#include <linux/fixp-arith.h>
> > > >   #include <linux/mod_devicetable.h>
> > > >   #include <linux/module.h>
> > > >   #include <linux/types.h>
> > > > @@ -75,6 +79,73 @@ struct wmi_brightness_args {
> > > >       u32 ignored[3];
> > > >   };
> > > >
> > > > +/**
> > > > + * struct nvidia_wmi_ec_backlight_priv - driver private data
> > > > + * @bl_dev:       the associated backlight device
> > > > + * @proxy_target: backlight device which receives relayed brightness changes
> > > > > + * @nb:           notifier block for resume callback
> > > > + */
> > > > +struct nvidia_wmi_ec_backlight_priv {
> > > > +     struct backlight_device *bl_dev;
> > > > +     struct backlight_device *proxy_target;
> > > > +     struct notifier_block nb;
> > > > +};
> > > > +
> > > > +static char *backlight_proxy_target;
> > > > +module_param(backlight_proxy_target, charp, 0444);
> > > > +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> > > > +
> > > > +static int max_reprobe_attempts = 128;
> > > > +module_param(max_reprobe_attempts, int, 0444);
> > > > +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> > > > +
> > > > +static bool restore_level_on_resume;
> > > > +module_param(restore_level_on_resume, bool, 0444);
> > > > +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> > > > +
> > > > +/* Bit field values for quirks table */
> > > > +
> > > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> > > > +
> > > > +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> > > > +
> > > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> > > > +
> > > > +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> > > > > +#define HAS_QUIRK(data, name) (((long)(data)) & QUIRK(name))
> > > > +
> > > > +static int assign_quirks(const struct dmi_system_id *id)
> > > > +{
> > > > +     if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> > > > > +             restore_level_on_resume = true;
> > > > +
> > > > +     /* If the module parameter is set, override the quirks table */
> > > > +     if (!backlight_proxy_target) {
> > > > +             if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> > > > +                     backlight_proxy_target = "amdgpu_bl1";
> > > > +     }
> > > > +
> > > > > +     return 1;
> > > > +}
> > > > +
> > > > +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> > > > +     .callback = assign_quirks,                      \
> > > > +     .matches = {                                    \
> > > > +             DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> > > > +             DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> > > > +     },                                              \
> > > > +     .driver_data = (void *)(quirks)                 \
> > > > +}
> > > > +
> > > > +static const struct dmi_system_id quirks_table[] = {
> > > > +     QUIRK_ENTRY(
> > > > +             /* This quirk is preset as of firmware revision HACN31WW */
> > > > +             "LENOVO", "Legion S7 15ACH6",
> > > > +             QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> > > > +     ),
> > > > +     { }
> > > > +};
> > > > +
> > > >   /**
> > > >    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
> > > >    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> > > > @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
> > > >       return 0;
> > > >   }
> > > >
> > > > +/* Scale the current brightness level of 'from' to the range of 'to'. */
> > > > +static int scale_backlight_level(const struct backlight_device *from,
> > > > +                              const struct backlight_device *to)
> > > > +{
> > > > +     int from_max = from->props.max_brightness;
> > > > +     int from_level = from->props.brightness;
> > > > +     int to_max = to->props.max_brightness;
> > > > +
> > > > +     return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> > > > +}
> > > > +
> > > >   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
> > > >   {
> > > >       struct wmi_device *wdev = bl_get_data(bd);
> > > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > > +     struct backlight_device *proxy_target = priv->proxy_target;
> > > > +
> > > > +     if (proxy_target) {
> > > > +             int level = scale_backlight_level(bd, proxy_target);
> > > > +
> > > > +             if (backlight_device_set_brightness(proxy_target, level))
> > > > +                     pr_warn("Failed to relay backlight update to \"%s\"",
> > > > +                             backlight_proxy_target);
> > > > +     }
> > > >
> > > >       return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
> > > >                                    WMI_BRIGHTNESS_MODE_SET,
> > > > @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
> > > >       .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
> > > >   };
> > > >
> > > > +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> > > > +{
> > > > +
> > > > +     /*
> > > > +      * On some systems, the EC backlight level gets reset to 100% when
> > > > +      * resuming from suspend, but the backlight device state still reflects
> > > > +      * the pre-suspend value. Refresh the existing state to sync the EC's
> > > > +      * state back up with the kernel's.
> > > > +      */
> > > > +     if (event == PM_POST_SUSPEND) {
> > > > +             struct nvidia_wmi_ec_backlight_priv *p;
> > > > +             int ret;
> > > > +
> > > > +             p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> > > > +             ret = backlight_update_status(p->bl_dev);
> > > > +
> > > > +             if (ret)
> > > > +                     pr_warn("failed to refresh backlight level: %d", ret);
> > > > +
> > > > +             return NOTIFY_OK;
> > > > +     }
> > > > +
> > > > +     return NOTIFY_DONE;
> > > > +}
> > > > +
> > > > +static void putdev(void *data)
> > > > +{
> > > > +     struct device *dev = data;
> > > > +
> > > > +     put_device(dev);
> > > > +}
> > > > +
> > > >   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
> > > >   {
> > > > +     struct backlight_device *bdev, *target = NULL;
> > > > +     struct nvidia_wmi_ec_backlight_priv *priv;
> > > >       struct backlight_properties props = {};
> > > > -     struct backlight_device *bdev;
> > > >       u32 source;
> > > >       int ret;
> > > >
> > > > +     /*
> > > > +      * Check quirks tables to see if this system needs any of the firmware
> > > > +      * bug workarounds.
> > > > +      */
> > > > +     dmi_check_system(quirks_table);
> > > > +
> > > > +     if (backlight_proxy_target && backlight_proxy_target[0]) {
> > > > +             static int num_reprobe_attempts;
> > > > +
> > > > +             target = backlight_device_get_by_name(backlight_proxy_target);
> > > > +
> > > > +             if (target) {
> > > > +                     ret = devm_add_action_or_reset(&wdev->dev, putdev,
> > > > +                                                    &target->dev);
> > > > +                     if (ret)
> > > > +                             return ret;
> > > > +             } else {
> > > > +                     /*
> > > > +                      * The target backlight device might not be ready;
> > > > +                      * try again and disable backlight proxying if it
> > > > +                      * fails too many times.
> > > > +                      */
> > > > +                     if (num_reprobe_attempts < max_reprobe_attempts) {
> > > > +                             num_reprobe_attempts++;
> > > > +                             return -EPROBE_DEFER;
> > > > +                     }
> > > > +
> > > > +                     pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> > > > +                             backlight_proxy_target, max_reprobe_attempts);
> > > > +             }
> > > > +     }
> > > > +
> > > >       ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
> > > >                                  WMI_BRIGHTNESS_MODE_GET, &source);
> > > >       if (ret)
> > > > @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
> > > >                                             &wdev->dev, wdev,
> > > >                                             &nvidia_wmi_ec_backlight_ops,
> > > >                                             &props);
> > > > -     return PTR_ERR_OR_ZERO(bdev);
> > > > +
> > > > +     if (IS_ERR(bdev))
> > > > +             return PTR_ERR(bdev);
> > > > +
> > > > +     priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > +     if (!priv)
> > > > +             return -ENOMEM;
> > > > +
> > > > +     priv->bl_dev = bdev;
> > > > +
> > > > +     dev_set_drvdata(&wdev->dev, priv);
> > > > +
> > > > +     if (target) {
> > > > +             int level = scale_backlight_level(target, bdev);
> > > > +
> > > > +             if (backlight_device_set_brightness(bdev, level))
> > > > +                     pr_warn("Unable to import initial brightness level from %s.",
> > > > +                             backlight_proxy_target);
> > > > +             priv->proxy_target = target;
> > > > +     }
> > > > +
> > > > +     if (restore_level_on_resume) {
> > > > +             priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> > > > +             register_pm_notifier(&priv->nb);
> > > > +     }
> > > > +
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> > > > +{
> > > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > > +
> > > > +     if (priv->nb.notifier_call)
> > > > +             unregister_pm_notifier(&priv->nb);
> > > >   }
> > > >
> > > >   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> > > > @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
> > > >               .name = "nvidia-wmi-ec-backlight",
> > > >       },
> > > >       .probe = nvidia_wmi_ec_backlight_probe,
> > > > +     .remove = nvidia_wmi_ec_backlight_remove,
> > > >       .id_table = nvidia_wmi_ec_backlight_id_table,
> > > >   };
> > > >   module_wmi_driver(nvidia_wmi_ec_backlight_driver);
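
For reference, the scaling step in the quoted patch (scale_backlight_level())
maps the current level of one backlight device onto the range of another by
linear interpolation from 0..from_max to 0..to_max. A minimal userspace sketch
of that mapping follows. The names linear_interpolate() and scale_level() are
hypothetical stand-ins made up for this illustration, not kernel API, and the
64-bit intermediate is an assumption of the sketch rather than something the
patch itself does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the interpolation helper that the
 * patch takes from <linux/fixp-arith.h>: map x from [x0, x1] onto [y0, y1].
 * The 64-bit product is an assumption of this sketch, used so that large
 * brightness ranges cannot overflow the multiplication.
 */
static int linear_interpolate(int x0, int y0, int x1, int y1, int x)
{
	/* Degenerate output range, or x sits exactly on the lower endpoint. */
	if (y0 == y1 || x == x0)
		return y0;
	/* Degenerate input range, or x sits exactly on the upper endpoint. */
	if (x1 == x0 || x == x1)
		return y1;

	return (int)(y0 + ((int64_t)(y1 - y0) * (x - x0)) / (x1 - x0));
}

/* Scale a level in [0, from_max] to the range [0, to_max]. */
static int scale_level(int from_level, int from_max, int to_max)
{
	assert(from_max >= 0 && to_max >= 0);
	return linear_interpolate(0, 0, from_max, to_max, from_level);
}
```

With these definitions, a level of 50 on a 0..100 device maps to 127 on a
0..255 device, because the integer division truncates. The endpoints map
exactly: 0 stays 0, and from_max becomes to_max.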


* Re: [PATCH v2] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
  2023-01-31 19:56             ` Alexandru Dinu
@ 2023-02-07 23:23               ` Daniel Dadap
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel Dadap @ 2023-02-07 23:23 UTC (permalink / raw)
  To: Alexandru Dinu
  Cc: platform-driver-x86, Barnabás Pőcze, Hans de Goede,
	markgross, Limonciello, Mario, Deucher, Alexander

On Tue, Jan 31, 2023 at 09:56:03PM +0200, Alexandru Dinu wrote:
> Hello,
> 
> I updated from HACN31WW to the latest HACN39WW and tested on a fresh
> 6.1.7 kernel.
> The brightness issue is not fixed -- same old behaviour: controls only
> work after an initial resume from suspend.
> 
> So far, I've been using your patched version of the
> nvidia-wmi-ec-backlight module which worked great -- manually patching
> the module after each kernel update.
> 
> > somebody else reported a similar issue recently which appears to be resolved by updating to the latest firmware version.
> 
> Can you please point me to this reference?

It was a private report - I've added the reporter to this mail thread.
Unfortunately, it turns out that the "works after update" behavior was
actually a case of the backlight on that system coincidentally working
for a while after the update, before breaking again. On at least that
system, the backlight controls occasionally work as intended on their
own for a while.

> Thank you!
> 
> 
> On Tue, 31 Jan 2023 at 00:00, Daniel Dadap <ddadap@nvidia.com> wrote:
> >
> > Hi Alex,
> >
> > On Thu, Mar 17, 2022 at 12:09:03AM +0200, Alexandru Dinu wrote:
> > > > Note: the Tested-by: line above applies to the previous version of this
> > > > patch; an explicit ACK from the tester is required for it to apply to
> > > > the current version.
> > >
> > > I compiled and tested v2 on 5.16.14.
> > > Everything works as expected: brightness control & level restore work
> > > both on first boot and on subsequent sleep/resume cycles.
> >
> > I ended up abandoning this workaround patch because it was incompatible
> > with Hans's plan to clean up the backlight subsystem. In the meantime,
> > somebody else reported a similar issue recently which appears to be
> > resolved by updating to the latest firmware version. Have you updated to
> > the most recent firmware, and if so, are you still seeing this issue?
> >
> > > Regards,
> > > Alex
> > >
> > >
> > >
> > > On Wed, 16 Mar 2022 at 23:28, Daniel Dadap <ddadap@nvidia.com> wrote:
> > > >
> > > > Sorry, just noticed a typo in a comment:
> > > >
> > > > /* This quirk is preset as of firmware revision HACN31WW */
> > > >
> > > > Obviously that is meant to read "present". I'll fix that with the next
> > > > round of changes, assuming there will be additional review feedback.
> > > >
> > > > On 3/16/22 15:33, Daniel Dadap wrote:
> > > > > Some notebook systems with EC-driven backlight control appear to have a
> > > > > firmware bug which causes the system to use GPU-driven backlight control
> > > > > > upon a fresh boot, but then switch to EC-driven backlight control
> > > > > after completing a suspend/resume cycle. All the while, the firmware
> > > > > reports that the backlight is under EC control, regardless of what is
> > > > > actually controlling the backlight brightness.
> > > > >
> > > > > This leads to the following behavior:
> > > > >
> > > > > * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
> > > > >    WMI-wrapped ACPI method erroneously reporting EC control.
> > > > > * nvidia-wmi-ec-backlight does not work until after a suspend/resume
> > > > >    cycle, due to the backlight control actually being GPU-driven.
> > > > > * GPU drivers also register their own backlight handlers: in the case
> > > > >    of the notebook system where this behavior has been observed, both
> > > > >    amdgpu and the NVIDIA proprietary driver register backlight handlers.
> > > > > * The GPU which has backlight control upon a fresh boot (amdgpu in the
> > > > >    case observed so far) can successfully control the backlight through
> > > > >    its backlight driver's sysfs interface, but stops working after the
> > > > >    first suspend/resume cycle.
> > > > > * nvidia-wmi-ec-backlight is unable to control the backlight upon a
> > > > >    fresh boot, but begins to work after the first suspend/resume cycle.
> > > > > * The GPU which does not have backlight control (NVIDIA in this case)
> > > > >    is not able to control the backlight at any point while the system
> > > > >    is in operation. On similar hybrid systems with an EC-controlled
> > > > >    backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
> > > > >    does not register its backlight handler. It has not been determined
> > > > >    whether the non-functional handler registered by the NVIDIA driver
> > > > >    is due to another firmware bug, or a bug in the NVIDIA driver.
> > > > >
> > > > > Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> > > > > device, it takes precedence over the BACKLIGHT_RAW devices registered
> > > > > by the GPU drivers. This in turn leads to backlight control appearing
> > > > > to be non-functional until after completing a suspend/resume cycle.
> > > > > However, it is still possible to control the backlight through direct
> > > > > interaction with the working GPU driver's backlight sysfs interface.
> > > > >
> > > > > These systems also appear to have a second firmware bug which resets
> > > > > the EC's brightness level to 100% on resume, but leaves the state in
> > > > > the kernel at the pre-suspend level. This causes attempts to save
> > > > > and restore the backlight level across the suspend/resume cycle to
> > > > > fail, due to the level appearing not to change even though it did.
> > > > >
> > > > > In order to work around these issues, add a quirk table to detect
> > > > > systems that are known to show these behaviors. So far, there is
> > > > > only one known system that requires these workarounds, and both
> > > > > issues are present on that system, but the quirks are tracked
> > > > > separately to make it easier to add them to other systems which
> > > > > may exhibit one of the bugs, but not the other. The original systems
> > > > > that this driver was tested on during development do not exhibit
> > > > > either of these quirks.
> > > > >
> > > > > If a system with the "GPU driver has backlight control" quirk is
> > > > > detected, nvidia-wmi-ec-backlight will grab a reference to the working
> > > > > > (when freshly booted) GPU backlight handler and relay any backlight
> > > > > brightness level change requests directed at the EC to also be applied
> > > > > to the GPU backlight interface. This leads to redundant updates
> > > > > directed at the GPU backlight driver after a suspend/resume cycle, but
> > > > > it does allow the EC backlight control to work when the system is
> > > > > freshly booted.
> > > > >
> > > > > If a system with the "backlight level reset to full on resume" quirk
> > > > > is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> > > > > reset the backlight to the previous level upon resume.
> > > > >
> > > > > These workarounds are also plumbed through to kernel module parameters,
> > > > > to make it easier for users who suspect they may be affected by one or
> > > > > both of these bugs to test whether these workarounds are effective on
> > > > > their systems as well.
> > > > >
> > > > > Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> > > > > Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> > > > > ---
> > > > > Note: the Tested-by: line above applies to the previous version of this
> > > > > patch; an explicit ACK from the tester is required for it to apply to
> > > > > the current version.
> > > > >
> > > > > v2:
> > > > >   * Add readable sysfs files for module params, use linear interpolation
> > > > >     from fixp-arith.h, fix return value of notifier callback, use devm_*()
> > > > >     for kzalloc and put_device. (Barnabás Pőcze <pobrn@protonmail.com>)
> > > > >   * Add comment to denote known firmware versions that exhibit the bugs.
> > > > >     (Mario Limonciello <Mario.Limonciello@amd.com>)
> > > > >   * Unify separate per-quirk tables. (Hans de Goede <hdegoede@redhat.com>)
> > > > >
> > > > >   .../platform/x86/nvidia-wmi-ec-backlight.c    | 196 +++++++++++++++++-
> > > > >   1 file changed, 194 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > > index 61e37194df70..95e1ddf780fc 100644
> > > > > --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > > +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> > > > > @@ -3,8 +3,12 @@
> > > > >    * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
> > > > >    */
> > > > >
> > > > > +#define pr_fmt(f) KBUILD_MODNAME ": " f "\n"
> > > > > +
> > > > >   #include <linux/acpi.h>
> > > > >   #include <linux/backlight.h>
> > > > > +#include <linux/dmi.h>
> > > > > +#include <linux/fixp-arith.h>
> > > > >   #include <linux/mod_devicetable.h>
> > > > >   #include <linux/module.h>
> > > > >   #include <linux/types.h>
> > > > > @@ -75,6 +79,73 @@ struct wmi_brightness_args {
> > > > >       u32 ignored[3];
> > > > >   };
> > > > >
> > > > > +/**
> > > > > + * struct nvidia_wmi_ec_backlight_priv - driver private data
> > > > > + * @bl_dev:       the associated backlight device
> > > > > + * @proxy_target: backlight device which receives relayed brightness changes
> > > > > > + * @nb:           notifier block for resume callback
> > > > > + */
> > > > > +struct nvidia_wmi_ec_backlight_priv {
> > > > > +     struct backlight_device *bl_dev;
> > > > > +     struct backlight_device *proxy_target;
> > > > > +     struct notifier_block nb;
> > > > > +};
> > > > > +
> > > > > +static char *backlight_proxy_target;
> > > > > +module_param(backlight_proxy_target, charp, 0444);
> > > > > +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> > > > > +
> > > > > +static int max_reprobe_attempts = 128;
> > > > > +module_param(max_reprobe_attempts, int, 0444);
> > > > > +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> > > > > +
> > > > > +static bool restore_level_on_resume;
> > > > > +module_param(restore_level_on_resume, bool, 0444);
> > > > > +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> > > > > +
> > > > > +/* Bit field values for quirks table */
> > > > > +
> > > > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_RESTORE_LEVEL_ON_RESUME   BIT(0)
> > > > > +
> > > > > +/* bits 1-7: reserved for future quirks; bits 8+: proxy target device names */
> > > > > +
> > > > > +#define NVIDIA_WMI_EC_BACKLIGHT_QUIRK_PROXY_TO_AMDGPU_BL1       BIT(8)
> > > > > +
> > > > > +#define QUIRK(name) NVIDIA_WMI_EC_BACKLIGHT_QUIRK_##name
> > > > > > +#define HAS_QUIRK(data, name) (((long)(data)) & QUIRK(name))
> > > > > +
> > > > > +static int assign_quirks(const struct dmi_system_id *id)
> > > > > +{
> > > > > +     if (HAS_QUIRK(id->driver_data, RESTORE_LEVEL_ON_RESUME))
> > > > > > +             restore_level_on_resume = true;
> > > > > +
> > > > > +     /* If the module parameter is set, override the quirks table */
> > > > > +     if (!backlight_proxy_target) {
> > > > > +             if (HAS_QUIRK(id->driver_data, PROXY_TO_AMDGPU_BL1))
> > > > > +                     backlight_proxy_target = "amdgpu_bl1";
> > > > > +     }
> > > > > +
> > > > > > +     return 1;
> > > > > +}
> > > > > +
> > > > > +#define QUIRK_ENTRY(vendor, product, quirks) {          \
> > > > > +     .callback = assign_quirks,                      \
> > > > > +     .matches = {                                    \
> > > > > +             DMI_MATCH(DMI_SYS_VENDOR, vendor),      \
> > > > > +             DMI_MATCH(DMI_PRODUCT_VERSION, product) \
> > > > > +     },                                              \
> > > > > +     .driver_data = (void *)(quirks)                 \
> > > > > +}
> > > > > +
> > > > > +static const struct dmi_system_id quirks_table[] = {
> > > > > +     QUIRK_ENTRY(
> > > > > +             /* This quirk is preset as of firmware revision HACN31WW */
> > > > > +             "LENOVO", "Legion S7 15ACH6",
> > > > > +             QUIRK(RESTORE_LEVEL_ON_RESUME) | QUIRK(PROXY_TO_AMDGPU_BL1)
> > > > > +     ),
> > > > > +     { }
> > > > > +};
> > > > > +
> > > > >   /**
> > > > >    * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
> > > > >    * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> > > > > @@ -119,9 +190,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
> > > > >       return 0;
> > > > >   }
> > > > >
> > > > > +/* Scale the current brightness level of 'from' to the range of 'to'. */
> > > > > +static int scale_backlight_level(const struct backlight_device *from,
> > > > > +                              const struct backlight_device *to)
> > > > > +{
> > > > > +     int from_max = from->props.max_brightness;
> > > > > +     int from_level = from->props.brightness;
> > > > > +     int to_max = to->props.max_brightness;
> > > > > +
> > > > > +     return fixp_linear_interpolate(0, 0, from_max, to_max, from_level);
> > > > > +}
> > > > > +
> > > > >   static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
> > > > >   {
> > > > >       struct wmi_device *wdev = bl_get_data(bd);
> > > > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > > > +     struct backlight_device *proxy_target = priv->proxy_target;
> > > > > +
> > > > > +     if (proxy_target) {
> > > > > +             int level = scale_backlight_level(bd, proxy_target);
> > > > > +
> > > > > +             if (backlight_device_set_brightness(proxy_target, level))
> > > > > +                     pr_warn("Failed to relay backlight update to \"%s\"",
> > > > > +                             backlight_proxy_target);
> > > > > +     }
> > > > >
> > > > >       return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
> > > > >                                    WMI_BRIGHTNESS_MODE_SET,
> > > > > @@ -147,13 +239,78 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
> > > > >       .get_brightness = nvidia_wmi_ec_backlight_get_brightness,
> > > > >   };
> > > > >
> > > > > +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> > > > > +{
> > > > > +
> > > > > +     /*
> > > > > +      * On some systems, the EC backlight level gets reset to 100% when
> > > > > +      * resuming from suspend, but the backlight device state still reflects
> > > > > +      * the pre-suspend value. Refresh the existing state to sync the EC's
> > > > > +      * state back up with the kernel's.
> > > > > +      */
> > > > > +     if (event == PM_POST_SUSPEND) {
> > > > > +             struct nvidia_wmi_ec_backlight_priv *p;
> > > > > +             int ret;
> > > > > +
> > > > > +             p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> > > > > +             ret = backlight_update_status(p->bl_dev);
> > > > > +
> > > > > +             if (ret)
> > > > > +                     pr_warn("failed to refresh backlight level: %d", ret);
> > > > > +
> > > > > +             return NOTIFY_OK;
> > > > > +     }
> > > > > +
> > > > > +     return NOTIFY_DONE;
> > > > > +}
> > > > > +
> > > > > +static void putdev(void *data)
> > > > > +{
> > > > > +     struct device *dev = data;
> > > > > +
> > > > > +     put_device(dev);
> > > > > +}
> > > > > +
> > > > >   static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
> > > > >   {
> > > > > +     struct backlight_device *bdev, *target = NULL;
> > > > > +     struct nvidia_wmi_ec_backlight_priv *priv;
> > > > >       struct backlight_properties props = {};
> > > > > -     struct backlight_device *bdev;
> > > > >       u32 source;
> > > > >       int ret;
> > > > >
> > > > > +     /*
> > > > > +      * Check quirks tables to see if this system needs any of the firmware
> > > > > +      * bug workarounds.
> > > > > +      */
> > > > > +     dmi_check_system(quirks_table);
> > > > > +
> > > > > +     if (backlight_proxy_target && backlight_proxy_target[0]) {
> > > > > +             static int num_reprobe_attempts;
> > > > > +
> > > > > +             target = backlight_device_get_by_name(backlight_proxy_target);
> > > > > +
> > > > > +             if (target) {
> > > > > +                     ret = devm_add_action_or_reset(&wdev->dev, putdev,
> > > > > +                                                    &target->dev);
> > > > > +                     if (ret)
> > > > > +                             return ret;
> > > > > +             } else {
> > > > > +                     /*
> > > > > +                      * The target backlight device might not be ready;
> > > > > +                      * try again and disable backlight proxying if it
> > > > > +                      * fails too many times.
> > > > > +                      */
> > > > > +                     if (num_reprobe_attempts < max_reprobe_attempts) {
> > > > > +                             num_reprobe_attempts++;
> > > > > +                             return -EPROBE_DEFER;
> > > > > +                     }
> > > > > +
> > > > > +                     pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> > > > > +                             backlight_proxy_target, max_reprobe_attempts);
> > > > > +             }
> > > > > +     }
> > > > > +
> > > > >       ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
> > > > >                                  WMI_BRIGHTNESS_MODE_GET, &source);
> > > > >       if (ret)
> > > > > @@ -188,7 +345,41 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
> > > > >                                             &wdev->dev, wdev,
> > > > >                                             &nvidia_wmi_ec_backlight_ops,
> > > > >                                             &props);
> > > > > -     return PTR_ERR_OR_ZERO(bdev);
> > > > > +
> > > > > +     if (IS_ERR(bdev))
> > > > > +             return PTR_ERR(bdev);
> > > > > +
> > > > > +     priv = devm_kzalloc(&wdev->dev, sizeof(*priv), GFP_KERNEL);
> > > > > +     if (!priv)
> > > > > +             return -ENOMEM;
> > > > > +
> > > > > +     priv->bl_dev = bdev;
> > > > > +
> > > > > +     dev_set_drvdata(&wdev->dev, priv);
> > > > > +
> > > > > +     if (target) {
> > > > > +             int level = scale_backlight_level(target, bdev);
> > > > > +
> > > > > +             if (backlight_device_set_brightness(bdev, level))
> > > > > +                     pr_warn("Unable to import initial brightness level from %s.",
> > > > > +                             backlight_proxy_target);
> > > > > +             priv->proxy_target = target;
> > > > > +     }
> > > > > +
> > > > > +     if (restore_level_on_resume) {
> > > > > +             priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> > > > > +             register_pm_notifier(&priv->nb);
> > > > > +     }
> > > > > +
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> > > > > +{
> > > > > +     struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> > > > > +
> > > > > +     if (priv->nb.notifier_call)
> > > > > +             unregister_pm_notifier(&priv->nb);
> > > > >   }
> > > > >
> > > > >   #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> > > > > @@ -204,6 +395,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
> > > > >               .name = "nvidia-wmi-ec-backlight",
> > > > >       },
> > > > >       .probe = nvidia_wmi_ec_backlight_probe,
> > > > > +     .remove = nvidia_wmi_ec_backlight_remove,
> > > > >       .id_table = nvidia_wmi_ec_backlight_id_table,
> > > > >   };
> > > > >   module_wmi_driver(nvidia_wmi_ec_backlight_driver);


* RE: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
@ 2022-03-16 20:13 Alexandru Dinu
  0 siblings, 0 replies; 31+ messages in thread
From: Alexandru Dinu @ 2022-03-16 20:13 UTC (permalink / raw)
  To: platform-driver-x86

Hi,

> I'll send out a v2 shortly: Alex, can you
> please retest when I do to make sure there aren't any regressions? None
> of these suggestions affect the core flow of how either of the
> workarounds work, so I'm not expecting any that wouldn't also reproduce
> on my EC backlight system that doesn't have either of these problems,
> but I can send you the updated version off-list first if you prefer.

It's ok either way. You can send me an updated version off-list.

> Alex, just FYI this was something that came to an AMD bug tracker and
> wanted you to be aware there are W/A going into nvidia-wmi-ec-backlight
> for some firmware problems with the mux.
> IIRC that was the original suspicion too on the bug reports.

Yes, thanks -- I followed this issue first:
https://gitlab.freedesktop.org/drm/amd/-/issues/1671.

> However I think it's still worth at least noting near the quirk in a comment
> what firmware version it was identified.  If later there is confirmation that
> a particular firmware version had fixed it the quirk can be adjusted to be
> dropped.

That's a good tip. The laptop I tested this on (Lenovo Legion S7
15ACH6) originally shipped with:

UEFI: LENOVO v: HACN27WW date: 08/02/2021

There is an update to version HACN31WW -- see
https://support.lenovo.com/ro/en/downloads/ds550201-bios-update-for-windows-10-64-bit-legion-s7-15ach6
I updated, but the issue was not addressed, which seems to be
expected given the rather short changelog:
HACN31WW
BIOS Notification:
1. Fixed
  1) N/A.
2. Add
  1) Add BOE0A1C support with Cookie and DR Key
3. Modified
  1) Modify MinShortTerm & MinLongTerm PowerLimit value
EC Notification:
1. Fixed
  1) None.
2. Add
  1) None.
3. Modified
  1) None.
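
As an illustration of the suggestion to record firmware versions next to
the quirk, here is a minimal userspace sketch in C. It is not the driver's
actual quirk table: struct fw_quirk, quirk_applies(), and the product match
string are hypothetical names chosen for this example, and the real DMI
strings reported by the machine may differ. Only the two firmware versions
and the date come from the report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a quirk entry; not the driver's real table. */
struct fw_quirk {
	const char *product;     /* assumed match string, may differ from real DMI data */
	const char *seen_on[3];  /* firmware versions where the bug was observed */
	const char *fixed_in;    /* NULL until a fixed firmware version is confirmed */
};

static const struct fw_quirk quirks[] = {
	{
		.product = "Legion S7 15ACH6",
		/*
		 * Observed with HACN27WW (08/02/2021); still present after
		 * updating to HACN31WW. No fixed version is known yet.
		 */
		.seen_on = { "HACN27WW", "HACN31WW", NULL },
		.fixed_in = NULL,
	},
};

/* Return true if the workaround should be applied on this system. */
static bool quirk_applies(const char *product, const char *bios_version)
{
	assert(product && bios_version);

	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct fw_quirk *q = &quirks[i];

		if (strcmp(q->product, product) != 0)
			continue;

		/* No confirmed fix: apply the quirk on every firmware version. */
		if (!q->fixed_in)
			return true;

		/*
		 * Simplifying assumption: version strings order
		 * lexicographically, so anything below fixed_in is affected.
		 */
		return strcmp(bios_version, q->fixed_in) < 0;
	}

	return false;
}
```

If a later firmware release is confirmed to fix the problem, setting
fixed_in for the entry would limit the quirk to older versions without
removing the record of where it was first seen.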

> If you end up introducing a module parameter to try to activate these quirks
> it might be viable to ask the folks in those issues to try the v2 of
> your patch too
> when you're ready with the module parameter.

I posted a link to this mailing list thread on
https://gitlab.freedesktop.org/drm/amd/-/issues/1671, so people there
are aware of it and can test.

Regards,
Alex


end of thread, other threads:[~2023-02-07 23:23 UTC | newest]

Thread overview: 31+ messages
2022-03-16  1:25 [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware Daniel Dadap
2022-03-16  2:50 ` Barnabás Pőcze
2022-03-16 15:11   ` Daniel Dadap
2022-03-16 15:29     ` Limonciello, Mario
2022-03-16 17:08       ` Daniel Dadap
2022-03-16 17:21         ` Limonciello, Mario
2022-03-16 17:37           ` Daniel Dadap
2022-03-16 18:25             ` Limonciello, Mario
2022-03-16 19:23               ` Daniel Dadap
2022-03-16 19:25                 ` Limonciello, Mario
2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
2022-03-16 21:28       ` Daniel Dadap
2022-03-16 22:09         ` Alexandru Dinu
2022-03-16 22:14           ` Alexandru Dinu
2023-01-30 22:00           ` Daniel Dadap
2023-01-31 19:56             ` Alexandru Dinu
2023-02-07 23:23               ` Daniel Dadap
2022-03-17 12:17       ` Hans de Goede
2022-03-17 13:28         ` Daniel Dadap
2022-03-17 16:42           ` Hans de Goede
2022-03-17 16:42             ` Hans de Goede
2022-03-17 17:35             ` Alex Deucher
2022-03-17 18:50               ` Daniel Dadap
2022-03-17 18:50                 ` Daniel Dadap
2022-03-17 18:36             ` Daniel Dadap
2022-03-17 18:36               ` Daniel Dadap
2022-03-18 17:42               ` Hans de Goede
2022-03-18 17:42                 ` Hans de Goede
2022-03-16 16:09 ` [PATCH] " Hans de Goede
2022-03-16 17:22   ` Daniel Dadap
2022-03-16 20:13 Alexandru Dinu
