All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Barnabás Pőcze" <pobrn@protonmail.com>
To: Daniel Dadap <ddadap@nvidia.com>
Cc: platform-driver-x86@vger.kernel.org,
	Alexandru Dinu <alex.dinu07@gmail.com>
Subject: Re: [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware
Date: Wed, 16 Mar 2022 02:50:27 +0000	[thread overview]
Message-ID: <v_ODhOndx3g1l-BhfKGCB6_nLY83LTc5vz1YDrVEVVF5CmgKUs1x9bmROyWXhmYkeQRVVnvfBnyrFyHaEKqtoZE5P7lKJJ1j_vE0J1Piq2Y=@protonmail.com> (raw)
In-Reply-To: <20220316012516.2233984-1-ddadap@nvidia.com>

Hi


The platform-driver-x86 maintainers should've probably been
CCd. You may or may not know, but the `scripts/get_maintainers.pl`
script can be used to determine the appropriate recipients.

2022. március 16., szerda 2:25 keltezéssel, Daniel Dadap írta:
> Some notebook systems with EC-driven backlight control appear to have a
> firmware bug which causes the system to use GPU-driven backlight control
> upon a fresh boot, but then switches to EC-driven backlight control
> after completing a suspend/resume cycle. All the while, the firmware
> reports that the backlight is under EC control, regardless of what is
> actually controlling the backlight brightness.
>
> This leads to the following behavior:
>
> * nvidia-wmi-ec-backlight gets probed on a fresh boot, due to the
>   WMI-wrapped ACPI method erroneously reporting EC control.
> * nvidia-wmi-ec-backlight does not work until after a suspend/resume
>   cycle, due to the backlight control actually being GPU-driven.
> * GPU drivers also register their own backlight handlers: in the case
>   of the notebook system where this behavior has been observed, both
>   amdgpu and the NVIDIA proprietary driver register backlight handlers.
> * The GPU which has backlight control upon a fresh boot (amdgpu in the
>   case observed so far) can successfully control the backlight through
>   its backlight driver's sysfs interface, but stops working after the
>   first suspend/resume cycle.
> * nvidia-wmi-ec-backlight is unable to control the backlight upon a
>   fresh boot, but begins to work after the first suspend/resume cycle.
> * The GPU which does not have backlight control (NVIDIA in this case)
>   is not able to control the backlight at any point while the system
>   is in operation. On similar hybrid systems with an EC-controlled
>   backlight, and AMD/NVIDIA iGPU/dGPU, the NVIDIA proprietary driver
>   does not register its backlight handler. It has not been determined
>   whether the non-functional handler registered by the NVIDIA driver
>   is due to another firmware bug, or a bug in the NVIDIA driver.
>
> Since nvidia-wmi-ec-backlight registers as a BACKLIGHT_FIRMWARE type
> device, it takes precedence over the BACKLIGHT_RAW devices registered
> by the GPU drivers. This in turn leads to backlight control appearing
> to be non-functional until after completing a suspend/resume cycle.
> However, it is still possible to control the backlight through direct
> interaction with the working GPU driver's backlight sysfs interface.
>
> These systems also appear to have a second firmware bug which resets
> the EC's brightness level to 100% on resume, but leaves the state in
> the kernel at the pre-suspend level. This causes attempts to save
> and restore the backlight level across the suspend/resume cycle to
> fail, due to the level appearing not to change even though it did.
>
> In order to work around these issue, add quirk tables to detect
> systems that are known to show these behaviors. So far, there is
> only one known system that requires these workarounds, and both
> issues are present on that system, but the quirks are tracked in
> separate tables to make it easier to add them to other systems which
> may exhibit one of the bugs, but not the other. The original systems
> that this driver was tested on during development do not exhibit
> either of these quirks.
>
> If a system with the "GPU driver has backlight control" quirk is
> detected, nvidia-wmi-ec-backlight will grab a reference to the working
> (when freshly booted) GPU backlight handler and relays any backlight
> brightness level change requests directed at the EC to also be applied
> to the GPU backlight interface. This leads to redundant updates
> directed at the GPU backlight driver after a suspend/resume cycle, but
> it does allow the EC backlight control to work when the system is
> freshly booted.
>
> If a system with the "backlight level reset to full on resume" quirk
> is detected, nvidia-wmi-ec-backlight will register a PM notifier to
> reset the backlight to the previous level upon resume.
>
> These workarounds are also plumbed through to kernel module parameters,
> to make it easier for users who suspect they may be affected by one or
> both of these bugs to test whether these workarounds are effective on
> their systems as well.
>
> Signed-off-by: Daniel Dadap <ddadap@nvidia.com>
> Tested-by: Alexandru Dinu <alex.dinu07@gmail.com>
> ---
>  .../platform/x86/nvidia-wmi-ec-backlight.c    | 181 +++++++++++++++++-
>  1 file changed, 179 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/platform/x86/nvidia-wmi-ec-backlight.c b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> index 61e37194df70..ccb3b506c12c 100644
> --- a/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> +++ b/drivers/platform/x86/nvidia-wmi-ec-backlight.c
> @@ -3,8 +3,11 @@
>   * Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.
>   */
>
> +#define pr_fmt(f) "%s: " f "\n", KBUILD_MODNAME

`KBUILD_MODNAME` is a string literal, so you can do e.g.

  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt


> +
>  #include <linux/acpi.h>
>  #include <linux/backlight.h>
> +#include <linux/dmi.h>
>  #include <linux/mod_devicetable.h>
>  #include <linux/module.h>
>  #include <linux/types.h>
> @@ -75,6 +78,69 @@ struct wmi_brightness_args {
>  	u32 ignored[3];
>  };
>
> +/**
> + * struct nvidia_wmi_ec_backlight_priv - driver private data
> + * @bl_dev:       the associated backlight device
> + * @proxy_target: backlight device which receives relayed brightness changes
> + * @notifier:     notifier block for resume callback
> + */
> +struct nvidia_wmi_ec_backlight_priv {
> +	struct backlight_device *bl_dev;
> +	struct backlight_device *proxy_target;
> +	struct notifier_block nb;
> +};
> +
> +static char *backlight_proxy_target;
> +module_param(backlight_proxy_target, charp, 0);

It seems these module parameters are neither readable nor writable,
is that intentional?


> +MODULE_PARM_DESC(backlight_proxy_target, "Relay brightness change requests to the named backlight driver, on systems which erroneously report EC backlight control.");
> +
> +static int max_reprobe_attempts = 128;

Can you elaborate how this number was arrived at?


> +module_param(max_reprobe_attempts, int, 0);
> +MODULE_PARM_DESC(max_reprobe_attempts, "Limit of reprobe attempts when relaying brightness change requests.");
> +
> +static bool restore_level_on_resume;
> +module_param(restore_level_on_resume, bool, 0);
> +MODULE_PARM_DESC(restore_level_on_resume, "Restore the backlight level when resuming from suspend, on systems which reset the EC's backlight level on resume.");
> +
> +static int assign_relay_quirk(const struct dmi_system_id *id)
> +{
> +	backlight_proxy_target = id->driver_data;
> +	return true;
> +}
> +
> +#define PROXY_QUIRK_ENTRY(vendor, product, quirk_data) { \
> +	.callback = assign_relay_quirk,                  \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product)  \
> +	},                                               \
> +	.driver_data = quirk_data                        \
> +}
> +
> +static const struct dmi_system_id proxy_quirk_table[] = {
> +	PROXY_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6", "amdgpu_bl1"),
> +	{ }
> +};
> +
> +static int assign_restore_quirk(const struct dmi_system_id *id)
> +{
> +	restore_level_on_resume = true;
> +	return true;
> +}
> +
> +#define RESTORE_QUIRK_ENTRY(vendor, product) {           \
> +	.callback = assign_restore_quirk,                \
> +	.matches = {                                     \
> +		DMI_MATCH(DMI_SYS_VENDOR, vendor),       \
> +		DMI_MATCH(DMI_PRODUCT_VERSION, product), \
> +	}                                                \
> +}
> +
> +static const struct dmi_system_id restore_quirk_table[] = {
> +	RESTORE_QUIRK_ENTRY("LENOVO", "Legion S7 15ACH6"),
> +	{ }
> +};
> +
>  /**
>   * wmi_brightness_notify() - helper function for calling WMI-wrapped ACPI method
>   * @w:    Pointer to the struct wmi_device identified by %WMI_BRIGHTNESS_GUID
> @@ -119,9 +185,30 @@ static int wmi_brightness_notify(struct wmi_device *w, enum wmi_brightness_metho
>  	return 0;
>  }
>
> +static int scale_backlight_level(struct backlight_device *a,
> +				 struct backlight_device *b)
> +{
> +	/* because floating point math in the kernel is annoying */
> +	const int scaling_factor = 65536;
> +	int level = a->props.brightness;
> +	int relative_level = level * scaling_factor / a->props.max_brightness;
> +
> +	return relative_level * b->props.max_brightness / scaling_factor;
> +}

Maybe

  fixp_linear_interpolate(0, 0, a->props.max_brightness, b->props.max_brightness, a->props.brightness);

? (from `linux/fixp-arith.h`)


> +
>  static int nvidia_wmi_ec_backlight_update_status(struct backlight_device *bd)
>  {
>  	struct wmi_device *wdev = bl_get_data(bd);
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target) {
> +		int level = scale_backlight_level(bd, proxy_target);
> +
> +		if (backlight_device_set_brightness(proxy_target, level))
> +			pr_warn("Failed to relay backlight update to \"%s\"",
> +				backlight_proxy_target);
> +	}
>
>  	return wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_LEVEL,
>  	                             WMI_BRIGHTNESS_MODE_SET,
> @@ -147,13 +234,65 @@ static const struct backlight_ops nvidia_wmi_ec_backlight_ops = {
>  	.get_brightness = nvidia_wmi_ec_backlight_get_brightness,
>  };
>
> +static int nvidia_wmi_ec_backlight_pm_notifier(struct notifier_block *nb, unsigned long event, void *d)
> +{
> +
> +	/*
> +	 * On some systems, the EC backlight level gets reset to 100% when
> +	 * resuming from suspend, but the backlight device state still reflects
> +	 * the pre-suspend value. Refresh the existing state to sync the EC's
> +	 * state back up with the kernel's.
> +	 */
> +	if (event == PM_POST_SUSPEND) {
> +		struct nvidia_wmi_ec_backlight_priv *p;
> +
> +		p = container_of(nb, struct nvidia_wmi_ec_backlight_priv, nb);
> +		return backlight_update_status(p->bl_dev);

`backlight_update_status()` returns a negative errno while the notifier chain
expects something else. It would probably be better to return `NOTIFY_DONE`
in all cases. Currently a suitable error from `backlight_update_status()` will
stop the notifier chain.


> +	}
> +
> +	return 0;
> +}
> +
>  static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ctx)
>  {
> +	struct backlight_device *bdev, *target = NULL;
> +	struct nvidia_wmi_ec_backlight_priv *priv;
>  	struct backlight_properties props = {};
> -	struct backlight_device *bdev;
>  	u32 source;
>  	int ret;
>
> +	/*
> +	 * Check quirks tables to see if this system needs any of the firmware
> +	 * bug workarounds.
> +	 */
> +
> +	/* User-set quirks from the module parameters take precedence */
> +	if (!backlight_proxy_target)
> +		dmi_check_system(proxy_quirk_table);
> +
> +	dmi_check_system(restore_quirk_table);
> +
> +	if (backlight_proxy_target && backlight_proxy_target[0]) {
> +		static int num_reprobe_attempts;
> +
> +		target = backlight_device_get_by_name(backlight_proxy_target);
> +
> +		if (!target) {
> +			/*
> +			 * The target backlight device might not be ready;
> +			 * try again and disable backlight proxying if it
> +			 * fails too many times.
> +			 */
> +			if (num_reprobe_attempts < max_reprobe_attempts) {
> +				num_reprobe_attempts++;
> +				return -EPROBE_DEFER;
> +			}
> +
> +			pr_warn("Unable to acquire %s after %d attempts. Disabling backlight proxy.",
> +				backlight_proxy_target, max_reprobe_attempts);
> +		}
> +	}

I think `target` is not put in case of error. You probably need to add something like:

  if (target) {
    ret = devm_add_action_or_reset(&wdev->dev, put_device_wrapper, target);
    if (ret < 0)
      return ret;
  }


> +
>  	ret = wmi_brightness_notify(wdev, WMI_BRIGHTNESS_METHOD_SOURCE,
>  	                           WMI_BRIGHTNESS_MODE_GET, &source);
>  	if (ret)
> @@ -188,7 +327,44 @@ static int nvidia_wmi_ec_backlight_probe(struct wmi_device *wdev, const void *ct
>  					      &wdev->dev, wdev,
>  					      &nvidia_wmi_ec_backlight_ops,
>  					      &props);
> -	return PTR_ERR_OR_ZERO(bdev);
> +
> +	if (IS_ERR(bdev))
> +		return PTR_ERR(bdev);
> +
> +	priv = kzalloc(sizeof(*priv), GFP_KERNEL);

`devm_kzalloc()` would probably be better and you should check if `!priv`.


> +	priv->bl_dev = bdev;
> +
> +	dev_set_drvdata(&wdev->dev, priv);
> +
> +	if (target) {
> +		int level = scale_backlight_level(target, bdev);
> +
> +		if (backlight_device_set_brightness(bdev, level))
> +			pr_warn("Unable to import initial brightness level from %s.",
> +				backlight_proxy_target);
> +		priv->proxy_target = target;
> +	}
> +
> +	if (restore_level_on_resume) {
> +		priv->nb.notifier_call = nvidia_wmi_ec_backlight_pm_notifier;
> +		register_pm_notifier(&priv->nb);
> +	}
> +
> +	return 0;
> +}
> +
> +static void nvidia_wmi_ec_backlight_remove(struct wmi_device *wdev)
> +{
> +	struct nvidia_wmi_ec_backlight_priv *priv = dev_get_drvdata(&wdev->dev);
> +	struct backlight_device *proxy_target = priv->proxy_target;
> +
> +	if (proxy_target)
> +		put_device(&proxy_target->dev);

If you switch to `devm_add_action_or_reset()`, this will not be needed.


> +
> +	if (priv->nb.notifier_call)
> +		unregister_pm_notifier(&priv->nb);
> +
> +	kfree(priv);

If you switch to `devm_kzalloc()`, this won't be needed.


>  }
>
>  #define WMI_BRIGHTNESS_GUID "603E9613-EF25-4338-A3D0-C46177516DB7"
> @@ -204,6 +380,7 @@ static struct wmi_driver nvidia_wmi_ec_backlight_driver = {
>  		.name = "nvidia-wmi-ec-backlight",
>  	},
>  	.probe = nvidia_wmi_ec_backlight_probe,
> +	.remove = nvidia_wmi_ec_backlight_remove,
>  	.id_table = nvidia_wmi_ec_backlight_id_table,
>  };
>  module_wmi_driver(nvidia_wmi_ec_backlight_driver);
> --
> 2.27.0
>

Lastly, is it expected that these bugs will be properly fixed?


Regards,
Barnabás Pőcze

  reply	other threads:[~2022-03-16  2:50 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-16  1:25 [PATCH] nvidia-wmi-ec-backlight: Add workarounds for confused firmware Daniel Dadap
2022-03-16  2:50 ` Barnabás Pőcze [this message]
2022-03-16 15:11   ` Daniel Dadap
2022-03-16 15:29     ` Limonciello, Mario
2022-03-16 17:08       ` Daniel Dadap
2022-03-16 17:21         ` Limonciello, Mario
2022-03-16 17:37           ` Daniel Dadap
2022-03-16 18:25             ` Limonciello, Mario
2022-03-16 19:23               ` Daniel Dadap
2022-03-16 19:25                 ` Limonciello, Mario
2022-03-16 20:33     ` [PATCH v2] " Daniel Dadap
2022-03-16 21:28       ` Daniel Dadap
2022-03-16 22:09         ` Alexandru Dinu
2022-03-16 22:14           ` Alexandru Dinu
2023-01-30 22:00           ` Daniel Dadap
2023-01-31 19:56             ` Alexandru Dinu
2023-02-07 23:23               ` Daniel Dadap
2022-03-17 12:17       ` Hans de Goede
2022-03-17 13:28         ` Daniel Dadap
2022-03-17 16:42           ` Hans de Goede
2022-03-17 16:42             ` Hans de Goede
2022-03-17 17:35             ` Alex Deucher
2022-03-17 18:50               ` Daniel Dadap
2022-03-17 18:50                 ` Daniel Dadap
2022-03-17 18:36             ` Daniel Dadap
2022-03-17 18:36               ` Daniel Dadap
2022-03-18 17:42               ` Hans de Goede
2022-03-18 17:42                 ` Hans de Goede
2022-03-16 16:09 ` [PATCH] " Hans de Goede
2022-03-16 17:22   ` Daniel Dadap
2022-03-16 20:13 Alexandru Dinu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='v_ODhOndx3g1l-BhfKGCB6_nLY83LTc5vz1YDrVEVVF5CmgKUs1x9bmROyWXhmYkeQRVVnvfBnyrFyHaEKqtoZE5P7lKJJ1j_vE0J1Piq2Y=@protonmail.com' \
    --to=pobrn@protonmail.com \
    --cc=alex.dinu07@gmail.com \
    --cc=ddadap@nvidia.com \
    --cc=platform-driver-x86@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.