All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] acpi/button: revert v4.10 behavior
@ 2017-05-10 16:12 Benjamin Tissoires
  2017-05-10 16:12 ` [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode" Benjamin Tissoires
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-10 16:12 UTC (permalink / raw)
  To: Rafael J . Wysocki, Lv Zheng; +Cc: Jiri Eischmann, linux-acpi, linux-kernel

The new default 'open' behavior for acpi_lid_initialize_state() is just
wrong. It breaks professional laptops with a docking station [1].

Booting the laptop with the LID closed is something common and now there
is no way of knowing the actual state of the LID switch at boot. Please
use a user-space solution as described in 2/2 if you really don't want to
add quirks in the kernel.

Cheers,
Benjamin

[1] https://bugzilla.gnome.org/show_bug.cgi?id=782380

Benjamin Tissoires (2):
  Revert "ACPI / button: Remove lid_init_state=method mode"
  Revert "ACPI / button: Change default behavior to lid_init_state=open"

 Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
 drivers/acpi/button.c           | 11 ++++++++++-
 2 files changed, 22 insertions(+), 5 deletions(-)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-10 16:12 [PATCH 0/2] acpi/button: revert v4.10 behavior Benjamin Tissoires
@ 2017-05-10 16:12 ` Benjamin Tissoires
  2017-05-11  0:58   ` Zheng, Lv
  2017-05-10 16:12 ` [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open" Benjamin Tissoires
  2017-05-11  0:09 ` [PATCH 0/2] acpi/button: revert v4.10 behavior Rafael J. Wysocki
  2 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-10 16:12 UTC (permalink / raw)
  To: Rafael J . Wysocki, Lv Zheng; +Cc: Jiri Eischmann, linux-acpi, linux-kernel

This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.

Even if the method can be buggy some times, it's still a need
when you boot a laptop with a lid closed and an external monitor
connected (typical situation when using a docking station)

Note: this option has been removed without being deprecated, which
is terrible in term of distribution handling. The new default "open"
is plain wrong and we don't even have the chance to keep the current
default option because it's not there anymore.

Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
---
 Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
 drivers/acpi/button.c           |  9 +++++++++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
index 22cb309..effe7af 100644
--- a/Documentation/acpi/acpi-lid.txt
+++ b/Documentation/acpi/acpi-lid.txt
@@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
 If the userspace hasn't been prepared to ignore the unreliable "opened"
 events and the unreliable initial state notification, Linux users can use
 the following kernel parameters to handle the possible issues:
-A. button.lid_init_state=open:
+A. button.lid_init_state=method:
+   When this option is specified, the ACPI button driver reports the
+   initial lid state using the returning value of the _LID control method
+   and whether the "opened"/"closed" events are paired fully relies on the
+   firmware implementation.
+   This option can be used to fix some platforms where the returning value
+   of the _LID control method is reliable but the initial lid state
+   notification is missing.
+   This option is the default behavior during the period the userspace
+   isn't ready to handle the buggy AML tables.
+B. button.lid_init_state=open:
    When this option is specified, the ACPI button driver always reports the
    initial lid state as "opened" and whether the "opened"/"closed" events
    are paired fully relies on the firmware implementation.
    This may fix some platforms where the returning value of the _LID
    control method is not reliable and the initial lid state notification is
    missing.
-   This option is the default behavior during the period the userspace
-   isn't ready to handle the buggy AML tables.
 
 If the userspace has been prepared to ignore the unreliable "opened" events
 and the unreliable initial state notification, Linux users should always
 use the following kernel parameter:
-B. button.lid_init_state=ignore:
+C. button.lid_init_state=ignore:
    When this option is specified, the ACPI button driver never reports the
    initial lid state and there is a compensation mechanism implemented to
    ensure that the reliable "closed" notifications can always be delievered
diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index 668137e..6d5a8c1 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -57,6 +57,7 @@
 
 #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
 #define ACPI_BUTTON_LID_INIT_OPEN	0x01
+#define ACPI_BUTTON_LID_INIT_METHOD	0x02
 
 #define _COMPONENT		ACPI_BUTTON_COMPONENT
 ACPI_MODULE_NAME("button");
@@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
 	case ACPI_BUTTON_LID_INIT_OPEN:
 		(void)acpi_lid_notify_state(device, 1);
 		break;
+	case ACPI_BUTTON_LID_INIT_METHOD:
+		(void)acpi_lid_update_state(device);
+		break;
 	case ACPI_BUTTON_LID_INIT_IGNORE:
 	default:
 		break;
@@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
 	if (!strncmp(val, "open", sizeof("open") - 1)) {
 		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
 		pr_info("Notify initial lid state as open\n");
+	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
+		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
+		pr_info("Notify initial lid state with _LID return value\n");
 	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
 		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
 		pr_info("Do not notify initial lid state\n");
@@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
 	switch (lid_init_state) {
 	case ACPI_BUTTON_LID_INIT_OPEN:
 		return sprintf(buffer, "open");
+	case ACPI_BUTTON_LID_INIT_METHOD:
+		return sprintf(buffer, "method");
 	case ACPI_BUTTON_LID_INIT_IGNORE:
 		return sprintf(buffer, "ignore");
 	default:
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-10 16:12 [PATCH 0/2] acpi/button: revert v4.10 behavior Benjamin Tissoires
  2017-05-10 16:12 ` [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode" Benjamin Tissoires
@ 2017-05-10 16:12 ` Benjamin Tissoires
  2017-05-11  0:59   ` Zheng, Lv
  2017-05-11  0:09 ` [PATCH 0/2] acpi/button: revert v4.10 behavior Rafael J. Wysocki
  2 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-10 16:12 UTC (permalink / raw)
  To: Rafael J . Wysocki, Lv Zheng; +Cc: Jiri Eischmann, linux-acpi, linux-kernel

This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.

Even if the method implementation can be buggy on some platform,
the "open" choice is worse. It breaks docking stations basically
and there is no way to have a user-space hwdb to fix that.

On the contrary, it's rather easy in user-space to have a hwdb
with the problematic platforms. Then, libinput (1.7.0+) can fix
the state of the LID switch for us: you need to set the udev
property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.

When libinput detects internal keyboard events, it will
overwrite the state of the switch to open, making it reliable
again. Given that logind only checks the LID switch value after
a timeout, we can assume the user will use the internal keyboard
before this timeout expires.

For example, such a hwdb entry is:

libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
 LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open

Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
---
 drivers/acpi/button.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index 6d5a8c1..e19f530 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -113,7 +113,7 @@ struct acpi_button {
 
 static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
 static struct acpi_device *lid_device;
-static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
+static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
 
 static unsigned long lid_report_interval __read_mostly = 500;
 module_param(lid_report_interval, ulong, 0644);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] acpi/button: revert v4.10 behavior
  2017-05-10 16:12 [PATCH 0/2] acpi/button: revert v4.10 behavior Benjamin Tissoires
  2017-05-10 16:12 ` [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode" Benjamin Tissoires
  2017-05-10 16:12 ` [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open" Benjamin Tissoires
@ 2017-05-11  0:09 ` Rafael J. Wysocki
  2017-05-11  9:33   ` Benjamin Tissoires
  2 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-11  0:09 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Lv Zheng, Jiri Eischmann,
	ACPI Devel Maling List, Linux Kernel Mailing List

On Wed, May 10, 2017 at 6:12 PM, Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
> The new default 'open' behavior for acpi_lid_initialize_state() is just
> wrong. It breaks professional laptops with a docking station [1].
>
> Booting the laptop with the LID closed is something common and now there
> is no way of knowing the actual state of the LID switch at boot. Please
> use a user-space solution as described in 2/2 if you really don't want to
> add quirks in the kernel.
>
> Cheers,
> Benjamin
>
> [1] https://bugzilla.gnome.org/show_bug.cgi?id=782380
>
> Benjamin Tissoires (2):
>   Revert "ACPI / button: Remove lid_init_state=method mode"
>   Revert "ACPI / button: Change default behavior to lid_init_state=open"
>
>  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
>  drivers/acpi/button.c           | 11 ++++++++++-
>  2 files changed, 22 insertions(+), 5 deletions(-)

Well, have you seen the recent series
(http://marc.info/?l=linux-acpi&m=149431335204701&w=2 and the
following) from Lv?

He evidently agrees with you that "ACPI / button: Remove
lid_init_state=method mode" should be reverted, but then there are
differences.

Can you please have a look at that and let me know what's wrong with
it in your view?  I'd like to have a full picture if poss.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-10 16:12 ` [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode" Benjamin Tissoires
@ 2017-05-11  0:58   ` Zheng, Lv
  2017-05-11  1:19     ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-11  0:58 UTC (permalink / raw)
  To: Benjamin Tissoires, Rafael J . Wysocki
  Cc: Jiri Eischmann, linux-acpi, linux-kernel

Hi, Benjiamin

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Sent: Thursday, May 11, 2017 12:13 AM
> To: Rafael J . Wysocki <rjw@rjwysocki.net>; Zheng, Lv <lv.zheng@intel.com>
> Cc: Jiri Eischmann <jeischma@redhat.com>; linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> 
> This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.
> 
> Even if the method can be buggy some times, it's still a need
> when you boot a laptop with a lid closed and an external monitor
> connected (typical situation when using a docking station)
> 
> Note: this option has been removed without being deprecated, which
> is terrible in term of distribution handling. The new default "open"
> is plain wrong and we don't even have the chance to keep the current
> default option because it's not there anymore.

I have reverted this:
https://patchwork.kernel.org/patch/9717109/

Also I noticed one thing:
https://patchwork.kernel.org/patch/9717111/

After I changed kernel lid notification to always send lid return value to other drivers.
In order to find out a single driver mode (without platform quirks) to handle both cases.
I failed.
I should still send close init state to the user space program to work around the external monitor issues.

So as you know, we need to send "open" init state to the user space program to work around suspend/resume loop issue.

Then for such platforms, how can ACPI button driver automatically detect if an external monitor is there?
Unless we fix the BIOS code to make lid return value work as user space's expectation.
OK, then this creates an endless business in ACPI community to "re-develop" BIOS tables if they cannot meat user space's expectation.
That sucks.

It sound the best way is the user space program equipped with hwdb quirks who knows everything to alter acpi button driver mode from user space to work around this.
For example:

If hwdb is hit, and there is no external monitor, then
 Echo "open" > /sys/module/button/parameters/lid_init_state
If hwdb is not hit or there is an external monitor, then
If hwdb is hit, and there is no external monitor, then
 Echo "close" > /sys/module/button/parameters/lid_init_state

So PATCH 2 is not useful.
Reverting that can trigger a regression loop we surely do not want to handle.

Thanks and best regards
Lv

> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> ---
>  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
>  drivers/acpi/button.c           |  9 +++++++++
>  2 files changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
> index 22cb309..effe7af 100644
> --- a/Documentation/acpi/acpi-lid.txt
> +++ b/Documentation/acpi/acpi-lid.txt
> @@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
>  If the userspace hasn't been prepared to ignore the unreliable "opened"
>  events and the unreliable initial state notification, Linux users can use
>  the following kernel parameters to handle the possible issues:
> -A. button.lid_init_state=open:
> +A. button.lid_init_state=method:
> +   When this option is specified, the ACPI button driver reports the
> +   initial lid state using the returning value of the _LID control method
> +   and whether the "opened"/"closed" events are paired fully relies on the
> +   firmware implementation.
> +   This option can be used to fix some platforms where the returning value
> +   of the _LID control method is reliable but the initial lid state
> +   notification is missing.
> +   This option is the default behavior during the period the userspace
> +   isn't ready to handle the buggy AML tables.
> +B. button.lid_init_state=open:
>     When this option is specified, the ACPI button driver always reports the
>     initial lid state as "opened" and whether the "opened"/"closed" events
>     are paired fully relies on the firmware implementation.
>     This may fix some platforms where the returning value of the _LID
>     control method is not reliable and the initial lid state notification is
>     missing.
> -   This option is the default behavior during the period the userspace
> -   isn't ready to handle the buggy AML tables.
> 
>  If the userspace has been prepared to ignore the unreliable "opened" events
>  and the unreliable initial state notification, Linux users should always
>  use the following kernel parameter:
> -B. button.lid_init_state=ignore:
> +C. button.lid_init_state=ignore:
>     When this option is specified, the ACPI button driver never reports the
>     initial lid state and there is a compensation mechanism implemented to
>     ensure that the reliable "closed" notifications can always be delievered
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index 668137e..6d5a8c1 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -57,6 +57,7 @@
> 
>  #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
>  #define ACPI_BUTTON_LID_INIT_OPEN	0x01
> +#define ACPI_BUTTON_LID_INIT_METHOD	0x02
> 
>  #define _COMPONENT		ACPI_BUTTON_COMPONENT
>  ACPI_MODULE_NAME("button");
> @@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
>  	case ACPI_BUTTON_LID_INIT_OPEN:
>  		(void)acpi_lid_notify_state(device, 1);
>  		break;
> +	case ACPI_BUTTON_LID_INIT_METHOD:
> +		(void)acpi_lid_update_state(device);
> +		break;
>  	case ACPI_BUTTON_LID_INIT_IGNORE:
>  	default:
>  		break;
> @@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
>  	if (!strncmp(val, "open", sizeof("open") - 1)) {
>  		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
>  		pr_info("Notify initial lid state as open\n");
> +	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
> +		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> +		pr_info("Notify initial lid state with _LID return value\n");
>  	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
>  		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
>  		pr_info("Do not notify initial lid state\n");
> @@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
>  	switch (lid_init_state) {
>  	case ACPI_BUTTON_LID_INIT_OPEN:
>  		return sprintf(buffer, "open");
> +	case ACPI_BUTTON_LID_INIT_METHOD:
> +		return sprintf(buffer, "method");
>  	case ACPI_BUTTON_LID_INIT_IGNORE:
>  		return sprintf(buffer, "ignore");
>  	default:
> --
> 2.9.3


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-10 16:12 ` [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open" Benjamin Tissoires
@ 2017-05-11  0:59   ` Zheng, Lv
  2017-05-11  9:45     ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-11  0:59 UTC (permalink / raw)
  To: Benjamin Tissoires, Rafael J . Wysocki
  Cc: Jiri Eischmann, linux-acpi, linux-kernel

Hi,

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> 
> This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> 
> Even if the method implementation can be buggy on some platform,
> the "open" choice is worse. It breaks docking stations basically
> and there is no way to have a user-space hwdb to fix that.
> 
> On the contrary, it's rather easy in user-space to have a hwdb
> with the problematic platforms. Then, libinput (1.7.0+) can fix
> the state of the LID switch for us: you need to set the udev
> property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> 
> When libinput detects internal keyboard events, it will
> overwrite the state of the switch to open, making it reliable
> again. Given that logind only checks the LID switch value after
> a timeout, we can assume the user will use the internal keyboard
> before this timeout expires.
> 
> For example, such a hwdb entry is:
> 
> libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
>  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open

For the reason mentioned previously and proofs here (see patch descriptions):
https://patchwork.kernel.org/patch/9717111/
We shouldn't do this.

Thanks and best regards
Lv

> 
> Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
> Cc: stable@vger.kernel.org
> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> ---
>  drivers/acpi/button.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> index 6d5a8c1..e19f530 100644
> --- a/drivers/acpi/button.c
> +++ b/drivers/acpi/button.c
> @@ -113,7 +113,7 @@ struct acpi_button {
> 
>  static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
>  static struct acpi_device *lid_device;
> -static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> +static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> 
>  static unsigned long lid_report_interval __read_mostly = 500;
>  module_param(lid_report_interval, ulong, 0644);
> --
> 2.9.3


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-11  0:58   ` Zheng, Lv
@ 2017-05-11  1:19     ` Zheng, Lv
  2017-05-11 10:12       ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-11  1:19 UTC (permalink / raw)
  To: Zheng, Lv, Benjamin Tissoires, Rafael J . Wysocki
  Cc: Jiri Eischmann, linux-acpi, linux-kernel

Hi, Benjamin

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> Lv
> Subject: RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> 
> Hi, Benjiamin
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Sent: Thursday, May 11, 2017 12:13 AM
> > To: Rafael J . Wysocki <rjw@rjwysocki.net>; Zheng, Lv <lv.zheng@intel.com>
> > Cc: Jiri Eischmann <jeischma@redhat.com>; linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org
> > Subject: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> >
> > This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.
> >
> > Even if the method can be buggy some times, it's still a need
> > when you boot a laptop with a lid closed and an external monitor
> > connected (typical situation when using a docking station)
> >
> > Note: this option has been removed without being deprecated, which
> > is terrible in term of distribution handling. The new default "open"
> > is plain wrong and we don't even have the chance to keep the current
> > default option because it's not there anymore.
> 
> I have reverted this:
> https://patchwork.kernel.org/patch/9717109/
> 
> Also I noticed one thing:
> https://patchwork.kernel.org/patch/9717111/
> 
> After I changed kernel lid notification to always send lid return value to other drivers.
> In order to find out a single driver mode (without platform quirks) to handle both cases.
> I failed.
> I should still send close init state to the user space program to work around the external monitor
> issues.
> 
> So as you know, we need to send "open" init state to the user space program to work around
> suspend/resume loop issue.
> 
> Then for such platforms, how can ACPI button driver automatically detect if an external monitor is
> there?
> Unless we fix the BIOS code to make lid return value work as user space's expectation.
> OK, then this creates an endless business in ACPI community to "re-develop" BIOS tables if they cannot
> meat user space's expectation.
> That sucks.
> 
> It sound the best way is the user space program equipped with hwdb quirks who knows everything to
> alter acpi button driver mode from user space to work around this.
> For example:
> 
> If hwdb is hit, and there is no external monitor, then
>  Echo "open" > /sys/module/button/parameters/lid_init_state
> If hwdb is not hit or there is an external monitor, then
> If hwdb is hit, and there is no external monitor, then
>  Echo "close" > /sys/module/button/parameters/lid_init_state

Let me do re-wording.
If hwdb is not hit
  echo "method" > /sys/module/button/parameters/lid_init_state
If hwdb is hit, and there is no external monitor, then
  echo "open" > /sys/module/button/parameters/lid_init_state
If hwdb is hit, and there is an external monitor
  echo "close" > /sys/module/button/parameters/lid_init_state
Then this always assumes the hard requirements of the platform quirks.
And it then looks it's better to do such switches in the user space as ACPI button driver doesn't know and doesn't have to know the existence of the external monitor.

However, is that possible to not introduce platform quirks?
For example, for systemd:
If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like nouveau drivers' ignorelid=Y option).
BTW, which program is responsible for lighting on internal/external monitors?
Can it determine that by its own without listening to the lid key events?
This is what we preferred.
If all of above usage models are corrected, we'll change acpi button driver default mode to "ignore".

Another problem for changing default mode back to "method" is:
If we changed button driver default mode back to "method", then ACPI community will be flooded of suspend/resume loop bug reports.
After we root cause that's not a kernel problem, do we have mean in other community to handle such reports?
For example, to collect hwdb entries.

Thanks and best regards
Lv

> 
> So PATCH 2 is not useful.
> Reverting that can trigger a regression loop we surely do not want to handle.
> 
> Thanks and best regards
> Lv
> 
> >
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > ---
> >  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
> >  drivers/acpi/button.c           |  9 +++++++++
> >  2 files changed, 21 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
> > index 22cb309..effe7af 100644
> > --- a/Documentation/acpi/acpi-lid.txt
> > +++ b/Documentation/acpi/acpi-lid.txt
> > @@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
> >  If the userspace hasn't been prepared to ignore the unreliable "opened"
> >  events and the unreliable initial state notification, Linux users can use
> >  the following kernel parameters to handle the possible issues:
> > -A. button.lid_init_state=open:
> > +A. button.lid_init_state=method:
> > +   When this option is specified, the ACPI button driver reports the
> > +   initial lid state using the returning value of the _LID control method
> > +   and whether the "opened"/"closed" events are paired fully relies on the
> > +   firmware implementation.
> > +   This option can be used to fix some platforms where the returning value
> > +   of the _LID control method is reliable but the initial lid state
> > +   notification is missing.
> > +   This option is the default behavior during the period the userspace
> > +   isn't ready to handle the buggy AML tables.
> > +B. button.lid_init_state=open:
> >     When this option is specified, the ACPI button driver always reports the
> >     initial lid state as "opened" and whether the "opened"/"closed" events
> >     are paired fully relies on the firmware implementation.
> >     This may fix some platforms where the returning value of the _LID
> >     control method is not reliable and the initial lid state notification is
> >     missing.
> > -   This option is the default behavior during the period the userspace
> > -   isn't ready to handle the buggy AML tables.
> >
> >  If the userspace has been prepared to ignore the unreliable "opened" events
> >  and the unreliable initial state notification, Linux users should always
> >  use the following kernel parameter:
> > -B. button.lid_init_state=ignore:
> > +C. button.lid_init_state=ignore:
> >     When this option is specified, the ACPI button driver never reports the
> >     initial lid state and there is a compensation mechanism implemented to
> >     ensure that the reliable "closed" notifications can always be delievered
> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > index 668137e..6d5a8c1 100644
> > --- a/drivers/acpi/button.c
> > +++ b/drivers/acpi/button.c
> > @@ -57,6 +57,7 @@
> >
> >  #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
> >  #define ACPI_BUTTON_LID_INIT_OPEN	0x01
> > +#define ACPI_BUTTON_LID_INIT_METHOD	0x02
> >
> >  #define _COMPONENT		ACPI_BUTTON_COMPONENT
> >  ACPI_MODULE_NAME("button");
> > @@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> >  	case ACPI_BUTTON_LID_INIT_OPEN:
> >  		(void)acpi_lid_notify_state(device, 1);
> >  		break;
> > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > +		(void)acpi_lid_update_state(device);
> > +		break;
> >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> >  	default:
> >  		break;
> > @@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
> >  	if (!strncmp(val, "open", sizeof("open") - 1)) {
> >  		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> >  		pr_info("Notify initial lid state as open\n");
> > +	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
> > +		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > +		pr_info("Notify initial lid state with _LID return value\n");
> >  	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> >  		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> >  		pr_info("Do not notify initial lid state\n");
> > @@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> >  	switch (lid_init_state) {
> >  	case ACPI_BUTTON_LID_INIT_OPEN:
> >  		return sprintf(buffer, "open");
> > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > +		return sprintf(buffer, "method");
> >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> >  		return sprintf(buffer, "ignore");
> >  	default:
> > --
> > 2.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/2] acpi/button: revert v4.10 behavior
  2017-05-11  0:09 ` [PATCH 0/2] acpi/button: revert v4.10 behavior Rafael J. Wysocki
@ 2017-05-11  9:33   ` Benjamin Tissoires
  0 siblings, 0 replies; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-11  9:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J . Wysocki, Lv Zheng, Jiri Eischmann,
	ACPI Devel Maling List, Linux Kernel Mailing List

Hi Rafael,

On May 11 2017 or thereabouts, Rafael J. Wysocki wrote:
> On Wed, May 10, 2017 at 6:12 PM, Benjamin Tissoires
> <benjamin.tissoires@redhat.com> wrote:
> > The new default 'open' behavior for acpi_lid_initialize_state() is just
> > wrong. It breaks professional laptops with a docking station [1].
> >
> > Booting the laptop with the LID closed is something common and now there
> > is no way of knowing the actual state of the LID switch at boot. Please
> > use a user-space solution as described in 2/2 if you really don't want to
> > add quirks in the kernel.
> >
> > Cheers,
> > Benjamin
> >
> > [1] https://bugzilla.gnome.org/show_bug.cgi?id=782380
> >
> > Benjamin Tissoires (2):
> >   Revert "ACPI / button: Remove lid_init_state=method mode"
> >   Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >
> >  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
> >  drivers/acpi/button.c           | 11 ++++++++++-
> >  2 files changed, 22 insertions(+), 5 deletions(-)
> 
> Well, have you seen the recent series
> (http://marc.info/?l=linux-acpi&m=149431335204701&w=2 and the
> following) from Lv?
> 

Yeah, sorry I saw it right after sending mine.

> He evidently agrees with you that "ACPI / button: Remove
> lid_init_state=method mode" should be reverted, but then there are
> differences.

I still believe the second patch should be reverted (I'll try to
convince Lv in the other replies).

> 
> Can you please have a look at that and let me know what's wrong with
> it in your view?  I'd like to have a full picture if poss.

Sure.

A "short" summary could be the following (I'll go in deeper details in
the other replies I'll send):
- it's true that some devices are broken in term of LID switch, but we
  never managed to get an accurate estimate of how many devices are
  affected. Last time we spoke only 3 devices were affected, 1 being 4
  years old or more, one other having a in-kernel solution already.
- the change in the default setting breaks many more devices (all the
  thinkpads from Lenovo, most professional Dell laptops too), and the
  distributions solution would be to set the default setting back to
  "method", so we are back to square one
- changing the default setting to "open" prevents user space to fix the
  kernel by adding heuristics
- the only viable solution is to have a list of problematic devices that
  don't report the correct LID switch state at boot (and later). Whether
  this list is in the kernel or in user-space, I don't care, but the
  current v4.11 code (and probably the second series Lv sent) prevents a
  user-space database to be set.

Cheers,
Benjamin

> 
> Thanks,
> Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-11  0:59   ` Zheng, Lv
@ 2017-05-11  9:45     ` Benjamin Tissoires
  2017-05-12  2:36       ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-11  9:45 UTC (permalink / raw)
  To: Zheng, Lv; +Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

On May 11 2017 or thereabouts, Zheng, Lv wrote:
> Hi,
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > 
> > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > 
> > Even if the method implementation can be buggy on some platform,
> > the "open" choice is worse. It breaks docking stations basically
> > and there is no way to have a user-space hwdb to fix that.
> > 
> > On the contrary, it's rather easy in user-space to have a hwdb
> > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > the state of the LID switch for us: you need to set the udev
> > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > 
> > When libinput detects internal keyboard events, it will
> > overwrite the state of the switch to open, making it reliable
> > again. Given that logind only checks the LID switch value after
> > a timeout, we can assume the user will use the internal keyboard
> > before this timeout expires.
> > 
> > For example, such a hwdb entry is:
> > 
> > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> 
> For the reason mentioned previously and proofs here (see patch descriptions):
> https://patchwork.kernel.org/patch/9717111/

I am not sure you can call this proofs. The "close" state is IMO the
exact same as the "method" one, it's just that the intel driver
triggers the evaluation of the _LID method, not acpi/button.c.

And remember that this new default prevents userspace to fix it because
it's rather easy to detect that the device is actually opened (input
events coming from interanl keyboard, touchscreen, or touchpad), while
reporting the lid switch as open means we can't know if the user is
simply not using the internal devices, or if we have just a wrong state.

Given that this patch breaks all Lenovos with a dock (I can guess that 6
lines this year are using docks, and within each line you have 2-5
variants), I'd suggest we do not break those existing laptops and just
revert this patch.

I would think other OEMs have a docking station with an actual power
button but I can't be sure by looking at the product pages.

> We shouldn't do this.

I strongly disagree.

I am fine if you don't want to have a blacklist in the kernel for the
devices that are problematic (the ones that require open), but please
don't prevent user space to have this blacklist and please do not make
false assumptions in the kernel on the state of a switch. This is worse
than it helps and in the end, user space won't be able to solve this if
you change the default behavior at each release.

Cheers,
Benjamin

> 
> Thanks and best regards
> Lv
> 
> > 
> > Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > ---
> >  drivers/acpi/button.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > index 6d5a8c1..e19f530 100644
> > --- a/drivers/acpi/button.c
> > +++ b/drivers/acpi/button.c
> > @@ -113,7 +113,7 @@ struct acpi_button {
> > 
> >  static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> >  static struct acpi_device *lid_device;
> > -static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > +static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > 
> >  static unsigned long lid_report_interval __read_mostly = 500;
> >  module_param(lid_report_interval, ulong, 0644);
> > --
> > 2.9.3
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-11  1:19     ` Zheng, Lv
@ 2017-05-11 10:12       ` Benjamin Tissoires
  2017-05-12  5:08         ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-11 10:12 UTC (permalink / raw)
  To: Zheng, Lv; +Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

On May 11 2017 or thereabouts, Zheng, Lv wrote:
> Hi, Benjamin
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> > Lv
> > Subject: RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > 
> > Hi, Benjiamin
> > 
> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > Sent: Thursday, May 11, 2017 12:13 AM
> > > To: Rafael J . Wysocki <rjw@rjwysocki.net>; Zheng, Lv <lv.zheng@intel.com>
> > > Cc: Jiri Eischmann <jeischma@redhat.com>; linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org
> > > Subject: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > >
> > > This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.
> > >
> > > Even if the method can be buggy some times, it's still a need
> > > when you boot a laptop with a lid closed and an external monitor
> > > connected (typical situation when using a docking station)
> > >
> > > Note: this option has been removed without being deprecated, which
> > > is terrible in term of distribution handling. The new default "open"
> > > is plain wrong and we don't even have the chance to keep the current
> > > default option because it's not there anymore.
> > 
> > I have reverted this:
> > https://patchwork.kernel.org/patch/9717109/

Yeah, sorry, as mentioned to Rafael, I only saw it after I sent my
series.

> > 
> > Also I noticed one thing:
> > https://patchwork.kernel.org/patch/9717111/
> > 
> > After I changed kernel lid notification to always send lid return value to other drivers.
> > In order to find out a single driver mode (without platform quirks) to handle both cases.
> > I failed.

Yes. As long as you do not have more information on the device, this is
not something you can solve with hacks in the kernel.

> > I should still send close init state to the user space program to work around the external monitor
> > issues.
> > 
> > So as you know, we need to send "open" init state to the user space program to work around
> > suspend/resume loop issue.

I disagree. You are solving a user space issue (suspend boot loop), with
the wrong tool. The user space expects the kernel to provide accurate
events, but if the hardware is wrong, we should either fix it
(overwriting AML tables), make reasonable assumptions based on the exact
model capabilities (is the power button accessible with the LID closed),
or teach user space about these situations.
There is no point in assuming wrong states in the kernel "to fix user
space" when user space is obviously not doing the right thing.

> > 
> > Then for such platforms, how can ACPI button driver automatically detect if an external monitor is
> > there?

That's not the ACPI button role.
However, user space can write to the switch to overwrite it's value when
it judges that the kernel is doing things wrong. Libinput is a pretty
good candidate, given it has a view of all input devices. And guess
what, the code is already there.

> > Unless we fix the BIOS code to make lid return value work as user space's expectation.

That would be great.

> > OK, then this creates an endless business in ACPI community to "re-develop" BIOS tables if they cannot
> > meat user space's expectation.
> > That sucks.

Yes, but unless you teach OEM to not do crap, that's our daily burden.
I'd love to not have to quirk endlessly all the drivers I maintain, but
each generation of new devices has a new creative way of breaking the
existing code, "because it works under Windows".

> > 
> > It sound the best way is the user space program equipped with hwdb quirks who knows everything to
> > alter acpi button driver mode from user space to work around this.

Yes, the hwdb entry is the solution (or a quirk list in acpi/button.c).
The advantage of the hwdb entry is that it will be asynchronous from the
kernel releases, and users can just drop a file in their /etc folder and
they solved the issue. Distribution will also be able to carry the list
of quirked devices, and hopefully end users won't see the boot loop.

> > For example:
> > 
> > If hwdb is hit, and there is no external monitor, then
> >  Echo "open" > /sys/module/button/parameters/lid_init_state
> > If hwdb is not hit or there is an external monitor, then
> > If hwdb is hit, and there is no external monitor, then
> >  Echo "close" > /sys/module/button/parameters/lid_init_state

Hum, no. This is too late. acpi/button.c is loaded before udev hits, so
the initial state will already be evaluated.

> 
> Let me do re-wording.
> If hwdb is not hit
>   echo "method" > /sys/module/button/parameters/lid_init_state
> If hwdb is hit, and there is no external monitor, then
>   echo "open" > /sys/module/button/parameters/lid_init_state
> If hwdb is hit, and there is an external monitor
>   echo "close" > /sys/module/button/parameters/lid_init_state
> Then this always assumes the hard requirements of the platform quirks.
> And it then looks it's better to do such switches in the user space as ACPI button driver doesn't know and doesn't have to know the existence of the external monitor.

Again, the external monitor doesn't matter here. The external monitor
issue is a user space choice to:
- not suspend if the LID is closed and a monitor is plugged
- only show the greater on the internal monitor if both are turned on.

The issue we are fixing here is the fact that the switch state is wrong,
which makes user space assumptions wrong too (and users angry).

But given that the LID switch is an actual input switch device, user
space can overwrite it by simply writing to the input node.


> 
> However, is that possible to not introduce platform quirks?
> For example, for systemd:
> If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like nouveau drivers' ignorelid=Y option).

Well, given that 99.9% of laptops have this ACPI lid button, you'll just
remove the feature from all laptops. 

> BTW, which program is responsible for lighting on internal/external monitors?

I would say the compositor or X, so gdm, kdm, gnome, kde, etc...

> Can it determine that by its own without listening to the lid key events?

Basically no. This switch is there for a reason. However, I am convinced
that a good heuristic is to say that if you are using the internal keyboard,
touchscreen or touchpad, unless the user has very very thin fingers, the
lid is open.

> This is what we preferred.
> If all of above usage models are corrected, we'll change acpi button driver default mode to "ignore".

No, we need to report accurate state, or explicitly mark the platform as
not reliable, and so we need "method" and a hwdb of problematic ones.

> 
> Another problem for changing default mode back to "method" is:
> If we changed button driver default mode back to "method", then ACPI community will be flooded of suspend/resume loop bug reports.

But that's your job to fix bugs. If there is a user space problem that
can be solved in user space, you just need to redirect the users to the
correct solution and close the bug.

But you are talking about "flood", and I don't think we ever talked
about more than 4 devices. So could you point me at a list of bugs that
you actually had to fix?

> After we root cause that's not a kernel problem, do we have mean in other community to handle such reports?
> For example, to collect hwdb entries.

libinput, systemd are good candidate.
Libinput already has the bits in place, so I'd say we should probably
ask the users to report a bug on the wayland/libinput component of
bugs.freedesktop.org. But this will only work if the default initial lid
state is "method".

Sorry for showing I am angry. But I thought we solved this months ago
and this bites back.

Cheers,
Benjamin

> 
> Thanks and best regards
> Lv
> 
> > 
> > So PATCH 2 is not useful.
> > Reverting that can trigger a regression loop we surely do not want to handle.
> > 
> > Thanks and best regards
> > Lv
> > 
> > >
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > > ---
> > >  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
> > >  drivers/acpi/button.c           |  9 +++++++++
> > >  2 files changed, 21 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
> > > index 22cb309..effe7af 100644
> > > --- a/Documentation/acpi/acpi-lid.txt
> > > +++ b/Documentation/acpi/acpi-lid.txt
> > > @@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
> > >  If the userspace hasn't been prepared to ignore the unreliable "opened"
> > >  events and the unreliable initial state notification, Linux users can use
> > >  the following kernel parameters to handle the possible issues:
> > > -A. button.lid_init_state=open:
> > > +A. button.lid_init_state=method:
> > > +   When this option is specified, the ACPI button driver reports the
> > > +   initial lid state using the returning value of the _LID control method
> > > +   and whether the "opened"/"closed" events are paired fully relies on the
> > > +   firmware implementation.
> > > +   This option can be used to fix some platforms where the returning value
> > > +   of the _LID control method is reliable but the initial lid state
> > > +   notification is missing.
> > > +   This option is the default behavior during the period the userspace
> > > +   isn't ready to handle the buggy AML tables.
> > > +B. button.lid_init_state=open:
> > >     When this option is specified, the ACPI button driver always reports the
> > >     initial lid state as "opened" and whether the "opened"/"closed" events
> > >     are paired fully relies on the firmware implementation.
> > >     This may fix some platforms where the returning value of the _LID
> > >     control method is not reliable and the initial lid state notification is
> > >     missing.
> > > -   This option is the default behavior during the period the userspace
> > > -   isn't ready to handle the buggy AML tables.
> > >
> > >  If the userspace has been prepared to ignore the unreliable "opened" events
> > >  and the unreliable initial state notification, Linux users should always
> > >  use the following kernel parameter:
> > > -B. button.lid_init_state=ignore:
> > > +C. button.lid_init_state=ignore:
> > >     When this option is specified, the ACPI button driver never reports the
> > >     initial lid state and there is a compensation mechanism implemented to
> > >     ensure that the reliable "closed" notifications can always be delievered
> > > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > > index 668137e..6d5a8c1 100644
> > > --- a/drivers/acpi/button.c
> > > +++ b/drivers/acpi/button.c
> > > @@ -57,6 +57,7 @@
> > >
> > >  #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
> > >  #define ACPI_BUTTON_LID_INIT_OPEN	0x01
> > > +#define ACPI_BUTTON_LID_INIT_METHOD	0x02
> > >
> > >  #define _COMPONENT		ACPI_BUTTON_COMPONENT
> > >  ACPI_MODULE_NAME("button");
> > > @@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > >  		(void)acpi_lid_notify_state(device, 1);
> > >  		break;
> > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > +		(void)acpi_lid_update_state(device);
> > > +		break;
> > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > >  	default:
> > >  		break;
> > > @@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
> > >  	if (!strncmp(val, "open", sizeof("open") - 1)) {
> > >  		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > >  		pr_info("Notify initial lid state as open\n");
> > > +	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
> > > +		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > > +		pr_info("Notify initial lid state with _LID return value\n");
> > >  	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> > >  		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> > >  		pr_info("Do not notify initial lid state\n");
> > > @@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> > >  	switch (lid_init_state) {
> > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > >  		return sprintf(buffer, "open");
> > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > +		return sprintf(buffer, "method");
> > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > >  		return sprintf(buffer, "ignore");
> > >  	default:
> > > --
> > > 2.9.3
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-11  9:45     ` Benjamin Tissoires
@ 2017-05-12  2:36       ` Zheng, Lv
  2017-05-12 21:50         ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-12  2:36 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, 

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> 
> On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > Hi,
> >
> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > >
> > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > >
> > > Even if the method implementation can be buggy on some platform,
> > > the "open" choice is worse. It breaks docking stations basically
> > > and there is no way to have a user-space hwdb to fix that.
> > >
> > > On the contrary, it's rather easy in user-space to have a hwdb
> > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > the state of the LID switch for us: you need to set the udev
> > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > >
> > > When libinput detects internal keyboard events, it will
> > > overwrite the state of the switch to open, making it reliable
> > > again. Given that logind only checks the LID switch value after
> > > a timeout, we can assume the user will use the internal keyboard
> > > before this timeout expires.
> > >
> > > For example, such a hwdb entry is:
> > >
> > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> >
> > For the reason mentioned previously and proofs here (see patch descriptions):
> > https://patchwork.kernel.org/patch/9717111/
> 
> I am not sure you can call this proofs. The "close" state is IMO the
> exact same as the "method" one, it's just that the intel driver
> triggers the evaluation of the _LID method, not acpi/button.c.

I should correct you that:
1. intel_lvds driver only evaluates _LID method for its own purpose,
   See my reply to PATCH 4-5;
2. acpi/button.c is still responsible for generating the lid input event (to input layer and thus the userspace)
   And the input event generated by button.c differs for the 2 modes.
   See my another reply to PATCH 02.

> 
> And remember that this new default prevents userspace to fix it because
> it's rather easy to detect that the device is actually opened (input
> events coming from interanl keyboard, touchscreen, or touchpad), while
> reporting the lid switch as open means we can't know if the user is
> simply not using the internal devices, or if we have just a wrong state.

That depends on what the meaning of "fix" is IMO.
I saw you wrote a lot in another message, let's discuss this there.

> 
> Given that this patch breaks all Lenovos with a dock (I can guess that 6
> lines this year are using docks, and within each line you have 2-5
> variants), I'd suggest we do not break those existing laptops and just
> revert this patch.
> 
> I would think other OEMs have a docking station with an actual power
> button but I can't be sure by looking at the product pages.

I'm not sure if it breaks the external monitor usage model.
Why don't the userspace just
1. always light on the external display and keep the internal display not lit on after boot/resume, or
2. don't do anything and let the BIOS to light on the right display, or
3. don't do anything and let the graphics drivers to light on the right display (using saved state when suspends and resumes to that saved state).

Why such decision have to be made based on ACPI control method lid device?
I cannot see a strong reason that ACPI control method lid device must participate in this usage model.

See drivers/gpu/drm/nouveau/nouveau_connector.c,
the ignorelid parameter proves that the graphics drivers can do this on their own and do this perfectly.
 
> 
> > We shouldn't do this.
> 
> I strongly disagree.
> 
> I am fine if you don't want to have a blacklist in the kernel for the
> devices that are problematic (the ones that require open), but please
> don't prevent user space to have this blacklist and please do not make
> false assumptions in the kernel on the state of a switch. This is worse
> than it helps and in the end, user space won't be able to solve this if
> you change the default behavior at each release.

We are actually also fine with any default value.

But be aware of that, ACPI subsystem just provides a button driver:
1. Allows users to suspend the system upon receiving "close" lid event;
2. Stops to support any other usage models but is willing to provide tunable behavior for the other system participants to support their special needs.
   Such participants include but are not limited to the user space tools and the other kernel modules.
   But the default behavior and the timing of tuning button driver behaviors should be determined by the other system participants.
   And as a consequence, the other system participants who change that must be responsible for fixing regressions related to conflicts of the needs.
   As such, we need to know some official bug report window to forward bug reports related to that.
   Otherwise, ACPI Bugzilla can be flooded by wrong bugs.

Cheers,
Lv

> 
> Cheers,
> Benjamin
> 
> >
> > Thanks and best regards
> > Lv
> >
> > >
> > > Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > > ---
> > >  drivers/acpi/button.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > > index 6d5a8c1..e19f530 100644
> > > --- a/drivers/acpi/button.c
> > > +++ b/drivers/acpi/button.c
> > > @@ -113,7 +113,7 @@ struct acpi_button {
> > >
> > >  static BLOCKING_NOTIFIER_HEAD(acpi_lid_notifier);
> > >  static struct acpi_device *lid_device;
> > > -static u8 lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > > +static u8 lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > >
> > >  static unsigned long lid_report_interval __read_mostly = 500;
> > >  module_param(lid_report_interval, ulong, 0644);
> > > --
> > > 2.9.3
> >

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-11 10:12       ` Benjamin Tissoires
@ 2017-05-12  5:08         ` Zheng, Lv
  2017-05-12  9:50           ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-12  5:08 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, 

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Benjamin
> Tissoires
> Subject: Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> 
> On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > Hi, Benjamin
> >
> > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> Zheng,
> > > Lv
> > > Subject: RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > >
> > > Hi, Benjiamin
> > >
> > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > Sent: Thursday, May 11, 2017 12:13 AM
> > > > To: Rafael J . Wysocki <rjw@rjwysocki.net>; Zheng, Lv <lv.zheng@intel.com>
> > > > Cc: Jiri Eischmann <jeischma@redhat.com>; linux-acpi@vger.kernel.org; linux-
> kernel@vger.kernel.org
> > > > Subject: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > > >
> > > > This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.
> > > >
> > > > Even if the method can be buggy some times, it's still a need
> > > > when you boot a laptop with a lid closed and an external monitor
> > > > connected (typical situation when using a docking station)
> > > >
> > > > Note: this option has been removed without being deprecated, which
> > > > is terrible in term of distribution handling. The new default "open"
> > > > is plain wrong and we don't even have the chance to keep the current
> > > > default option because it's not there anymore.
> > >
> > > I have reverted this:
> > > https://patchwork.kernel.org/patch/9717109/
> 
> Yeah, sorry, as mentioned to Rafael, I only saw it after I sent my
> series.
> 
> > >
> > > Also I noticed one thing:
> > > https://patchwork.kernel.org/patch/9717111/
> > >
> > > After I changed kernel lid notification to always send lid return value to other drivers.
> > > In order to find out a single driver mode (without platform quirks) to handle both cases.
> > > I failed.
> 
> Yes. As long as you do not have more information on the device, this is
> not something you can solve with hacks in the kernel.
> 
> > > I should still send close init state to the user space program to work around the external monitor
> > > issues.
> > >
> > > So as you know, we need to send "open" init state to the user space program to work around
> > > suspend/resume loop issue.
> 
> I disagree. You are solving a user space issue (suspend boot loop), with
> the wrong tool.

I'm not trying to solve it.
I'm trying to confirm, it’s the input event not the lid notifier event triggered the problem.

> The user space expects the kernel to provide accurate
> events, but if the hardware is wrong, we should either fix it
> (overwriting AML tables),

I don't think it's hardware related.
If it's hardware related, we should always be able to fix it.

IMO, it's firmware related.
And the firmware only provides functionalities Windows requires.
It cares no about what the Linux user space requires.
We have no mean to prevent future firmwares from breaking Linux user space again.
So if it trends to break in AML, introducing quirks to fix the old AMLs doesn't help to solve the problem.

> make reasonable assumptions based on the exact
> model capabilities (is the power button accessible with the LID closed),

Sounds it's always accessible.

> or teach user space about these situations.

We are trying.
But if the end users don't buy it, no user space programs will be interested in looking into the details of these issues.
So we need to teach end users first IMO.
Telling them to raise the issues to the right responsibles rather than acpi button driver.

Some end users even make reports according to some BKMs (it has been changed now after discussed on ACPI Bugzilla bug):
https://wiki.archlinux.org/index.php/Razer#Suspend_Loop

> There is no point in assuming wrong states in the kernel "to fix user
> space" when user space is obviously not doing the right thing.

Agreed, as we've already synchronized. :)

But the only problem is:
We can see flooded ACPI button driver problems reported on kernel Bugzilla.
That's why we changed the default behavior of button driver.
It really helped to mute all suspend/resume loop issue reports.

> 
> > >
> > > Then for such platforms, how can ACPI button driver automatically detect if an external monitor is
> > > there?
> 
> That's not the ACPI button role.
> However, user space can write to the switch to overwrite it's value when
> it judges that the kernel is doing things wrong. Libinput is a pretty
> good candidate, given it has a view of all input devices. And guess
> what, the code is already there.

You only told us the way the user space should use to fix the issue.
But didn't tell us who should be responsible of fixing this issue.

> 
> > > Unless we fix the BIOS code to make lid return value work as user space's expectation.
> 
> That would be great.

That's not the real fix.
Just another kind of work around and cannot prevent future bug reports.

> 
> > > OK, then this creates an endless business in ACPI community to "re-develop" BIOS tables if they
> cannot
> > > meat user space's expectation.
> > > That sucks.
> 
> Yes, but unless you teach OEM to not do crap, that's our daily burden.
> I'd love to not have to quirk endlessly all the drivers I maintain, but
> each generation of new devices has a new creative way of breaking the
> existing code, "because it works under Windows".

Or we can teach user space to not do any expectation other than what Windows supports on OEMs.

> 
> > >
> > > It sound the best way is the user space program equipped with hwdb quirks who knows everything to
> > > alter acpi button driver mode from user space to work around this.
> 
> Yes, the hwdb entry is the solution (or a quirk list in acpi/button.c).

I'd rather to see such quirks accumulated in the user side not in the button driver.

> The advantage of the hwdb entry is that it will be asynchronous from the
> kernel releases, and users can just drop a file in their /etc folder and
> they solved the issue. Distribution will also be able to carry the list
> of quirked devices, and hopefully end users won't see the boot loop.

The advantage sounds good.

> 
> > > For example:
> > >
> > > If hwdb is hit, and there is no external monitor, then
> > >  Echo "open" > /sys/module/button/parameters/lid_init_state
> > > If hwdb is not hit or there is an external monitor, then
> > > If hwdb is hit, and there is no external monitor, then
> > >  Echo "close" > /sys/module/button/parameters/lid_init_state
> 
> Hum, no. This is too late. acpi/button.c is loaded before udev hits, so
> the initial state will already be evaluated.

But this can affect post-resume behavior.
And the bug reports are related to the post-resume behaviors.

> 
> >
> > Let me do re-wording.
> > If hwdb is not hit
> >   echo "method" > /sys/module/button/parameters/lid_init_state
> > If hwdb is hit, and there is no external monitor, then
> >   echo "open" > /sys/module/button/parameters/lid_init_state
> > If hwdb is hit, and there is an external monitor
> >   echo "close" > /sys/module/button/parameters/lid_init_state
> > Then this always assumes the hard requirements of the platform quirks.
> > And it then looks it's better to do such switches in the user space as ACPI button driver doesn't
> know and doesn't have to know the existence of the external monitor.
> 
> Again, the external monitor doesn't matter here. The external monitor
> issue is a user space choice to:
> - not suspend if the LID is closed and a monitor is plugged
> - only show the greater on the internal monitor if both are turned on.

Which program turned the lid off when the lid was closed?

> 
> The issue we are fixing here is the fact that the switch state is wrong,
> which makes user space assumptions wrong too (and users angry).

Considering no platform quirks.

If ACPI button driver sends SW_LID, users are likely angry.
Unless the user space programs are changed to "ignore open event".

If ACPI button driver doesn't send switch events, but key events.
The user space programs need to change to handle the new events.

So finally whatever we do, user space need to change a bit related to ACPI control method lid device.

> 
> But given that the LID switch is an actual input switch device, user
> space can overwrite it by simply writing to the input node.

I can see what you mean here.
You are suggesting to use input overwrite rather than changing lid_init_state to fix the issue.

This also means to me: user space is able to fix everything on its own, ACPI button driver needn't participant in.

> 
> 
> >
> > However, is that possible to not introduce platform quirks?
> > For example, for systemd:
> > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like nouveau
> drivers' ignorelid=Y option).
> 
> Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> remove the feature from all laptops.

No, I only removed the wrong usage models related to the strict "open" event.
All laptops are still capable of sending correct "close" event.

> 
> > BTW, which program is responsible for lighting on internal/external monitors?
> 
> I would say the compositor or X, so gdm, kdm, gnome, kde, etc...

So I already raised this to freedesktop:
https://bugs.freedesktop.org/show_bug.cgi?id=100923
But couldn't see any one there responding.

> 
> > Can it determine that by its own without listening to the lid key events?
> 
> Basically no. This switch is there for a reason. However, I am convinced
> that a good heuristic is to say that if you are using the internal keyboard,
> touchscreen or touchpad, unless the user has very very thin fingers, the
> lid is open.

I'm also convinced that the benefit of having a file in sysfs/procfs to allow user to know if the lid is open is marginal.

> 
> > This is what we preferred.
> > If all of above usage models are corrected, we'll change acpi button driver default mode to "ignore".
> 
> No, we need to report accurate state, or explicitly mark the platform as
> not reliable, and so we need "method" and a hwdb of problematic ones.

I'm actually OK with any default value.
But like previously I don't want to send a patch to change the default behavior to "open".
I don't want to be the one to change the default behavior back to "method".

> 
> >
> > Another problem for changing default mode back to "method" is:
> > If we changed button driver default mode back to "method", then ACPI community will be flooded of
> suspend/resume loop bug reports.
> 
> But that's your job to fix bugs. If there is a user space problem that
> can be solved in user space, you just need to redirect the users to the
> correct solution and close the bug.
> 
> But you are talking about "flood", and I don't think we ever talked
> about more than 4 devices. So could you point me at a list of bugs that
> you actually had to fix?

I just opened my Bugzilla filter, and copied top most lid related ones here:
https://bugzilla.kernel.org/show_bug.cgi?id=187271
https://bugzilla.kernel.org/show_bug.cgi?id=192231
https://bugzilla.kernel.org/show_bug.cgi?id=191211
https://bugzilla.kernel.org/show_bug.cgi?id=189171
https://bugzilla.kernel.org/show_bug.cgi?id=116001
...

And old reports here, occasionally opened by different users:
https://bugzilla.kernel.org/show_bug.cgi?id=89211
https://bugzilla.kernel.org/show_bug.cgi?id=106151
https://bugzilla.kernel.org/show_bug.cgi?id=106941
https://bugzilla.novell.com/show_bug.cgi?id=326814

Do you think the kernel community should prepare a candidate to handle such bug flood?
I'm unfortunately the one changed button driver recently and have to handle them.
That's why I changed default behavior from method to open.
After upstream merged that default behavior change, no such reports could be seen again.

But it's just 1-2 months Bugzilla silence before seeing a different bug flood trend:
This time, they are related to the external monitor usage model:
https://bugzilla.kernel.org/show_bug.cgi?id=187271
https://bugzilla.redhat.com/show_bug.cgi?id=1430259
https://bugzilla.kernel.org/show_bug.cgi?id=195455
Now that you can see why I didn't send a patch to change the default behavior to "open" at the time we were discussing.
That's also why I think we needn't revert back to "method" as the default behavior.

I'm afraid you may need to be prepared to handle some of the reports if you revert the default behavior back to "method".
So if you insist, I can put my Acked-by to your PATCH 2/2.

> 
> > After we root cause that's not a kernel problem, do we have mean in other community to handle such
> reports?
> > For example, to collect hwdb entries.
> 
> libinput, systemd are good candidate.
> Libinput already has the bits in place, so I'd say we should probably
> ask the users to report a bug on the wayland/libinput component of
> bugs.freedesktop.org. But this will only work if the default initial lid
> state is "method".

Good to know. :)
I've just changed the category of the forwarded report.

> 
> Sorry for showing I am angry. But I thought we solved this months ago
> and this bites back.

You always help me a lot, appreciated!

Cheers
Lv

> 
> Cheers,
> Benjamin
> 
> >
> > Thanks and best regards
> > Lv
> >
> > >
> > > So PATCH 2 is not useful.
> > > Reverting that can trigger a regression loop we surely do not want to handle.
> > >
> > > Thanks and best regards
> > > Lv
> > >
> > > >
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > > > ---
> > > >  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
> > > >  drivers/acpi/button.c           |  9 +++++++++
> > > >  2 files changed, 21 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
> > > > index 22cb309..effe7af 100644
> > > > --- a/Documentation/acpi/acpi-lid.txt
> > > > +++ b/Documentation/acpi/acpi-lid.txt
> > > > @@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
> > > >  If the userspace hasn't been prepared to ignore the unreliable "opened"
> > > >  events and the unreliable initial state notification, Linux users can use
> > > >  the following kernel parameters to handle the possible issues:
> > > > -A. button.lid_init_state=open:
> > > > +A. button.lid_init_state=method:
> > > > +   When this option is specified, the ACPI button driver reports the
> > > > +   initial lid state using the returning value of the _LID control method
> > > > +   and whether the "opened"/"closed" events are paired fully relies on the
> > > > +   firmware implementation.
> > > > +   This option can be used to fix some platforms where the returning value
> > > > +   of the _LID control method is reliable but the initial lid state
> > > > +   notification is missing.
> > > > +   This option is the default behavior during the period the userspace
> > > > +   isn't ready to handle the buggy AML tables.
> > > > +B. button.lid_init_state=open:
> > > >     When this option is specified, the ACPI button driver always reports the
> > > >     initial lid state as "opened" and whether the "opened"/"closed" events
> > > >     are paired fully relies on the firmware implementation.
> > > >     This may fix some platforms where the returning value of the _LID
> > > >     control method is not reliable and the initial lid state notification is
> > > >     missing.
> > > > -   This option is the default behavior during the period the userspace
> > > > -   isn't ready to handle the buggy AML tables.
> > > >
> > > >  If the userspace has been prepared to ignore the unreliable "opened" events
> > > >  and the unreliable initial state notification, Linux users should always
> > > >  use the following kernel parameter:
> > > > -B. button.lid_init_state=ignore:
> > > > +C. button.lid_init_state=ignore:
> > > >     When this option is specified, the ACPI button driver never reports the
> > > >     initial lid state and there is a compensation mechanism implemented to
> > > >     ensure that the reliable "closed" notifications can always be delievered
> > > > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > > > index 668137e..6d5a8c1 100644
> > > > --- a/drivers/acpi/button.c
> > > > +++ b/drivers/acpi/button.c
> > > > @@ -57,6 +57,7 @@
> > > >
> > > >  #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
> > > >  #define ACPI_BUTTON_LID_INIT_OPEN	0x01
> > > > +#define ACPI_BUTTON_LID_INIT_METHOD	0x02
> > > >
> > > >  #define _COMPONENT		ACPI_BUTTON_COMPONENT
> > > >  ACPI_MODULE_NAME("button");
> > > > @@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> > > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > > >  		(void)acpi_lid_notify_state(device, 1);
> > > >  		break;
> > > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > > +		(void)acpi_lid_update_state(device);
> > > > +		break;
> > > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > > >  	default:
> > > >  		break;
> > > > @@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
> > > >  	if (!strncmp(val, "open", sizeof("open") - 1)) {
> > > >  		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > > >  		pr_info("Notify initial lid state as open\n");
> > > > +	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
> > > > +		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > > > +		pr_info("Notify initial lid state with _LID return value\n");
> > > >  	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> > > >  		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> > > >  		pr_info("Do not notify initial lid state\n");
> > > > @@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> > > >  	switch (lid_init_state) {
> > > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > > >  		return sprintf(buffer, "open");
> > > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > > +		return sprintf(buffer, "method");
> > > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > > >  		return sprintf(buffer, "ignore");
> > > >  	default:
> > > > --
> > > > 2.9.3
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-12  5:08         ` Zheng, Lv
@ 2017-05-12  9:50           ` Benjamin Tissoires
  2017-05-15  4:54               ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-12  9:50 UTC (permalink / raw)
  To: Zheng, Lv; +Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

On May 12 2017 or thereabouts, Zheng, Lv wrote:
> Hi, 
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Benjamin
> > Tissoires
> > Subject: Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > 
> > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > Hi, Benjamin
> > >
> > > > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of
> > Zheng,
> > > > Lv
> > > > Subject: RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > > >
> > > > Hi, Benjiamin
> > > >
> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > > Sent: Thursday, May 11, 2017 12:13 AM
> > > > > To: Rafael J . Wysocki <rjw@rjwysocki.net>; Zheng, Lv <lv.zheng@intel.com>
> > > > > Cc: Jiri Eischmann <jeischma@redhat.com>; linux-acpi@vger.kernel.org; linux-
> > kernel@vger.kernel.org
> > > > > Subject: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> > > > >
> > > > > This reverts commit ecb10b694b72ca5ea51b3c90a71ff2a11963425a.
> > > > >
> > > > > Even if the method can be buggy some times, it's still a need
> > > > > when you boot a laptop with a lid closed and an external monitor
> > > > > connected (typical situation when using a docking station)
> > > > >
> > > > > Note: this option has been removed without being deprecated, which
> > > > > is terrible in term of distribution handling. The new default "open"
> > > > > is plain wrong and we don't even have the chance to keep the current
> > > > > default option because it's not there anymore.
> > > >
> > > > I have reverted this:
> > > > https://patchwork.kernel.org/patch/9717109/
> > 
> > Yeah, sorry, as mentioned to Rafael, I only saw it after I sent my
> > series.
> > 
> > > >
> > > > Also I noticed one thing:
> > > > https://patchwork.kernel.org/patch/9717111/
> > > >
> > > > After I changed kernel lid notification to always send lid return value to other drivers.
> > > > In order to find out a single driver mode (without platform quirks) to handle both cases.
> > > > I failed.
> > 
> > Yes. As long as you do not have more information on the device, this is
> > not something you can solve with hacks in the kernel.
> > 
> > > > I should still send close init state to the user space program to work around the external monitor
> > > > issues.
> > > >
> > > > So as you know, we need to send "open" init state to the user space program to work around
> > > > suspend/resume loop issue.
> > 
> > I disagree. You are solving a user space issue (suspend boot loop), with
> > the wrong tool.
> 
> I'm not trying to solve it.
> I'm trying to confirm, it’s the input event not the lid notifier event triggered the problem.

You would have asked, I would have answered without you spending too
much time figuring this out.

The reason is simple: the kernel should report an accurate state, no
matter what.
Given this statement, user space made some design decisions, and in that
situation, the design differs from the Windows behavior. Which means OEM
only check Windows and might break Linux.

End of the story. The only viable way of not having boot/resume loops is
to report accurate events/states, and for that you need a way to fix it.

> 
> > The user space expects the kernel to provide accurate
> > events, but if the hardware is wrong, we should either fix it
> > (overwriting AML tables),
> 
> I don't think it's hardware related.
> If it's hardware related, we should always be able to fix it.

Well, yeah, hardware/firmware, for me it's all before the kernel.
And hardware would mean an actual failure of the sensor, so definitely
not something we can fix at the software level.

> 
> IMO, it's firmware related.
> And the firmware only provides functionalities Windows requires.
> It cares no about what the Linux user space requires.
> We have no mean to prevent future firmwares from breaking Linux user space again.

agree.

> So if it trends to break in AML, introducing quirks to fix the old AMLs doesn't help to solve the problem.
> 
> > make reasonable assumptions based on the exact
> > model capabilities (is the power button accessible with the LID closed),
> 
> Sounds it's always accessible.

hmm, I am not sure we have the same fingers.... For me on all the
laptops I have seen, if the LID is actually (physically) closed, I can
not press the button. It's a design choice to not have anything powering
on the laptop when it's in your bag.

The reported state is something different, and it's software only.

Remember that the _LID acpi method corresponds to a physical state of
the laptop. So if the ACPI returned value is crap, that means user space
should either ignore it, or it should be fixed. But we can't have a
situation where we say, yes, it's reliable except few cases, without
telling which cases are unreliable.

> 
> > or teach user space about these situations.
> 
> We are trying.
> But if the end users don't buy it, no user space programs will be interested in looking into the details of these issues.
> So we need to teach end users first IMO.
> Telling them to raise the issues to the right responsibles rather than acpi button driver.

I believe they are correct raising it in acpi button. Again, kernel should
provide accurate state. It's not the case. Then, we can redirect them to
the band-aid provided by libinput.

> 
> Some end users even make reports according to some BKMs (it has been changed now after discussed on ACPI Bugzilla bug):
> https://wiki.archlinux.org/index.php/Razer#Suspend_Loop
> 
> > There is no point in assuming wrong states in the kernel "to fix user
> > space" when user space is obviously not doing the right thing.
> 
> Agreed, as we've already synchronized. :)
> 
> But the only problem is:
> We can see flooded ACPI button driver problems reported on kernel Bugzilla.
> That's why we changed the default behavior of button driver.

... and you broke thousands to fix a handful :)

> It really helped to mute all suspend/resume loop issue reports.

No. It helped you in your daily work routine. It broke thousands of
customers.

> 
> > 
> > > >
> > > > Then for such platforms, how can ACPI button driver automatically detect if an external monitor is
> > > > there?
> > 
> > That's not the ACPI button role.
> > However, user space can write to the switch to overwrite it's value when
> > it judges that the kernel is doing things wrong. Libinput is a pretty
> > good candidate, given it has a view of all input devices. And guess
> > what, the code is already there.
> 
> You only told us the way the user space should use to fix the issue.
> But didn't tell us who should be responsible of fixing this issue.

"Who" is libinput if you saw above. And if you are interested in the
actual names, that's Peter Hutterer and myself mostly.

> 
> > 
> > > > Unless we fix the BIOS code to make lid return value work as user space's expectation.
> > 
> > That would be great.
> 
> That's not the real fix.

According to the kernel policy, that's the only viable fix.

> Just another kind of work around and cannot prevent future bug reports.

Agree, that's a workaround, but you can not prevent OEMs to do crap.

> 
> > 
> > > > OK, then this creates an endless business in ACPI community to "re-develop" BIOS tables if they
> > cannot
> > > > meat user space's expectation.
> > > > That sucks.
> > 
> > Yes, but unless you teach OEM to not do crap, that's our daily burden.
> > I'd love to not have to quirk endlessly all the drivers I maintain, but
> > each generation of new devices has a new creative way of breaking the
> > existing code, "because it works under Windows".
> 
> Or we can teach user space to not do any expectation other than what Windows supports on OEMs.
> 
> > 
> > > >
> > > > It sound the best way is the user space program equipped with hwdb quirks who knows everything to
> > > > alter acpi button driver mode from user space to work around this.
> > 
> > Yes, the hwdb entry is the solution (or a quirk list in acpi/button.c).
> 
> I'd rather to see such quirks accumulated in the user side not in the button driver.
> 
> > The advantage of the hwdb entry is that it will be asynchronous from the
> > kernel releases, and users can just drop a file in their /etc folder and
> > they solved the issue. Distribution will also be able to carry the list
> > of quirked devices, and hopefully end users won't see the boot loop.
> 
> The advantage sounds good.
> 
> > 
> > > > For example:
> > > >
> > > > If hwdb is hit, and there is no external monitor, then
> > > >  Echo "open" > /sys/module/button/parameters/lid_init_state
> > > > If hwdb is not hit or there is an external monitor, then
> > > > If hwdb is hit, and there is no external monitor, then
> > > >  Echo "close" > /sys/module/button/parameters/lid_init_state
> > 
> > Hum, no. This is too late. acpi/button.c is loaded before udev hits, so
> > the initial state will already be evaluated.
> 
> But this can affect post-resume behavior.
> And the bug reports are related to the post-resume behaviors.

Nothing prevents libinput to fix that. And I think it'll fix it too,
because there is no differences between the boot false state and the
resume false state: the state is wrong, and this can be detected.

Also, nothing prevents to have the rule you wrote (overwriting
lid_init_state) to have resume work even better.

> 
> > 
> > >
> > > Let me do re-wording.
> > > If hwdb is not hit
> > >   echo "method" > /sys/module/button/parameters/lid_init_state
> > > If hwdb is hit, and there is no external monitor, then
> > >   echo "open" > /sys/module/button/parameters/lid_init_state
> > > If hwdb is hit, and there is an external monitor
> > >   echo "close" > /sys/module/button/parameters/lid_init_state
> > > Then this always assumes the hard requirements of the platform quirks.
> > > And it then looks it's better to do such switches in the user space as ACPI button driver doesn't
> > know and doesn't have to know the existence of the external monitor.
> > 
> > Again, the external monitor doesn't matter here. The external monitor
> > issue is a user space choice to:
> > - not suspend if the LID is closed and a monitor is plugged
> > - only show the greater on the internal monitor if both are turned on.
> 
> Which program turned the lid off when the lid was closed?

I am not remotely sure of what you are asking here...

> 
> > 
> > The issue we are fixing here is the fact that the switch state is wrong,
> > which makes user space assumptions wrong too (and users angry).
> 
> Considering no platform quirks.
> 
> If ACPI button driver sends SW_LID, users are likely angry.
> Unless the user space programs are changed to "ignore open event".
> 
> If ACPI button driver doesn't send switch events, but key events.
> The user space programs need to change to handle the new events.
> 
> So finally whatever we do, user space need to change a bit related to ACPI control method lid device.

Or we fix the switch to report accurate events/state.
You do realise that all the energy you are spending, answering to me,
talking to user space maintainers, users, all comes down to the fact
that you refuse to have hardware quirks?

If you don't want to write/maintain the code, fine. I can do it, but
please stop trying to make everyone else change because you just don't
want the burden of quirking a handful of laptops.

> 
> > 
> > But given that the LID switch is an actual input switch device, user
> > space can overwrite it by simply writing to the input node.
> 
> I can see what you mean here.
> You are suggesting to use input overwrite rather than changing lid_init_state to fix the issue.
> 
> This also means to me: user space is able to fix everything on its own, ACPI button driver needn't participant in.

Well, fixing it in the kernel means you don't need user space to fix it,
which is nice too on systems where there is no libinput. But on such
systems, maybe they just don't care and do something different.

Having a fix in user space has advantages too (see above), but for it to
be working, you need to stop changing the kernel behavior and the
kernel API.

> 
> > 
> > 
> > >
> > > However, is that possible to not introduce platform quirks?
> > > For example, for systemd:
> > > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like nouveau
> > drivers' ignorelid=Y option).
> > 
> > Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> > remove the feature from all laptops.
> 
> No, I only removed the wrong usage models related to the strict "open" event.
> All laptops are still capable of sending correct "close" event.

My bad, I read too fast and missed the "...Open..." part of
"IgnoreLidOpenEvent".

Though I am not sure IgnoreLidOpenEvent is accurate.
"OpenEventNeverSent" seems to reflect more the reality. But again,
that's not of our (kernel developers) business.

> 
> > 
> > > BTW, which program is responsible for lighting on internal/external monitors?
> > 
> > I would say the compositor or X, so gdm, kdm, gnome, kde, etc...
> 
> So I already raised this to freedesktop:
> https://bugs.freedesktop.org/show_bug.cgi?id=100923
> But couldn't see any one there responding.

Well, Joachim answered, saying that there is likely a regression in
acpi/button.c, not in i915.

I guess you presented the problem in the wrong term: what you want is
developer to fix the acpi_lid_open() call. All you need to do is
explaining them that the call might not be accurate. And I also believe
the solution is not to tweak i915 or nouveau driver, but make sure
acpi/button does the right thing.

> 
> > 
> > > Can it determine that by its own without listening to the lid key events?
> > 
> > Basically no. This switch is there for a reason. However, I am convinced
> > that a good heuristic is to say that if you are using the internal keyboard,
> > touchscreen or touchpad, unless the user has very very thin fingers, the
> > lid is open.
> 
> I'm also convinced that the benefit of having a file in sysfs/procfs to allow user to know if the lid is open is marginal.

I am convinced I don't get your point. We are obviously not talking
about the same thing here. I was talking about the physical world, where
the user interact with the laptop...

> 
> > 
> > > This is what we preferred.
> > > If all of above usage models are corrected, we'll change acpi button driver default mode to "ignore".
> > 
> > No, we need to report accurate state, or explicitly mark the platform as
> > not reliable, and so we need "method" and a hwdb of problematic ones.
> 
> I'm actually OK with any default value.

Thanks.

> But like previously I don't want to send a patch to change the default behavior to "open".
> I don't want to be the one to change the default behavior back to "method".

I already did. So feel free to point finger at me if somebody else
complains.

> 
> > 
> > >
> > > Another problem for changing default mode back to "method" is:
> > > If we changed button driver default mode back to "method", then ACPI community will be flooded of
> > suspend/resume loop bug reports.
> > 
> > But that's your job to fix bugs. If there is a user space problem that
> > can be solved in user space, you just need to redirect the users to the
> > correct solution and close the bug.
> > 
> > But you are talking about "flood", and I don't think we ever talked
> > about more than 4 devices. So could you point me at a list of bugs that
> > you actually had to fix?
> 
> I just opened my Bugzilla filter, and copied top most lid related ones here:
> https://bugzilla.kernel.org/show_bug.cgi?id=187271
> https://bugzilla.kernel.org/show_bug.cgi?id=192231
> https://bugzilla.kernel.org/show_bug.cgi?id=191211
> https://bugzilla.kernel.org/show_bug.cgi?id=189171
> https://bugzilla.kernel.org/show_bug.cgi?id=116001

That's 3 new laptops I wasn't aware of:
- Razer Blade Stealth 2016
- SAMSUNG 530U3C/530U4C
- Razerblade 2014

The Dell one is different as it seems the EC is doing things wrong, and
do not forward the notification even if the _LID acpi method returns the
proper value.

> ...
> 
> And old reports here, occasionally opened by different users:
> https://bugzilla.kernel.org/show_bug.cgi?id=89211

MS Surface Pro 1, already known

> https://bugzilla.kernel.org/show_bug.cgi?id=106151

Samsung N210 Plus something like 8 or 10 year old laptop, already known

> https://bugzilla.kernel.org/show_bug.cgi?id=106941

Surface 3, already fixed in the kernel by myself
(drivers/platform/x86/surface3-wmi.c)

> https://bugzilla.novell.com/show_bug.cgi?id=326814

Lenovo X60, at least 5 years old one, recent models don't have any issue.

> 
> Do you think the kernel community should prepare a candidate to handle such bug flood?

Well, I don't mind receiving such "flood" in my face: 7 laptops in 10
years seems something manageable. And also looks like Razer has a common
way of being wrong, so we can just match on RazerBlade to fix all
current and future generations.

> I'm unfortunately the one changed button driver recently and have to handle them.
> That's why I changed default behavior from method to open.
> After upstream merged that default behavior change, no such reports could be seen again.

Well, that is not true. Once a change gets in linux-acpi, you need to
wait for the next merge window for it to be in linus' tree to get wider
testing. Then you need to wait for the release to have distributions
taking it and actually have broader testers and most use cases tackled.

And this corresponds to the 2 months you had between your submission and
the v4.11 release, where fedora picked it in Fedora 26, and where our
users started testing it intensively.

> 
> But it's just 1-2 months Bugzilla silence before seeing a different bug flood trend:
> This time, they are related to the external monitor usage model:
> https://bugzilla.kernel.org/show_bug.cgi?id=187271
> https://bugzilla.redhat.com/show_bug.cgi?id=1430259
> https://bugzilla.kernel.org/show_bug.cgi?id=195455
> Now that you can see why I didn't send a patch to change the default behavior to "open" at the time we were discussing.
> That's also why I think we needn't revert back to "method" as the default behavior.

Still, I strongly disagree. We do not fix 7 devices by breaking
hundreds. That's just not how the kernel work.

> 
> I'm afraid you may need to be prepared to handle some of the reports if you revert the default behavior back to "method".
> So if you insist, I can put my Acked-by to your PATCH 2/2.

I insist, and I am not afraid. I'd rather be facing those than having
someone else breaking a single switch in the system in a way we can't do
anything else to solve the issue in user space.

> 
> > 
> > > After we root cause that's not a kernel problem, do we have mean in other community to handle such
> > reports?
> > > For example, to collect hwdb entries.
> > 
> > libinput, systemd are good candidate.
> > Libinput already has the bits in place, so I'd say we should probably
> > ask the users to report a bug on the wayland/libinput component of
> > bugs.freedesktop.org. But this will only work if the default initial lid
> > state is "method".
> 
> Good to know. :)
> I've just changed the category of the forwarded report.

Well, I am not sure your bug report should have been changed. The bug
report was regarding i915 being confident in the _LID acpi call, and
that is not something libinput can change.

> 
> > 
> > Sorry for showing I am angry. But I thought we solved this months ago
> > and this bites back.
> 
> You always help me a lot, appreciated!
> 

Thanks for your kindness :)

Would you agree on:
- acking both my patches so Rafael can take them ans stable can apply
  them ASAP
- give me maintainership of drivers/acpi/button.c (ACK a following patch
  where I add me to MAINTAINERS)
- redirect all kernel bugs to me related to LID switch (or having me the
  default assignee for a new acpi/button component)
- allow me to add a list of quirk in acpi/button.c if I judge this as
  fit

If you agree on that, that would be good for both our sake I think.

Cheers,
Benjamin

> Cheers
> Lv
> 
> > 
> > Cheers,
> > Benjamin
> > 
> > >
> > > Thanks and best regards
> > > Lv
> > >
> > > >
> > > > So PATCH 2 is not useful.
> > > > Reverting that can trigger a regression loop we surely do not want to handle.
> > > >
> > > > Thanks and best regards
> > > > Lv
> > > >
> > > > >
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
> > > > > ---
> > > > >  Documentation/acpi/acpi-lid.txt | 16 ++++++++++++----
> > > > >  drivers/acpi/button.c           |  9 +++++++++
> > > > >  2 files changed, 21 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/Documentation/acpi/acpi-lid.txt b/Documentation/acpi/acpi-lid.txt
> > > > > index 22cb309..effe7af 100644
> > > > > --- a/Documentation/acpi/acpi-lid.txt
> > > > > +++ b/Documentation/acpi/acpi-lid.txt
> > > > > @@ -59,20 +59,28 @@ button driver uses the following 3 modes in order not to trigger issues.
> > > > >  If the userspace hasn't been prepared to ignore the unreliable "opened"
> > > > >  events and the unreliable initial state notification, Linux users can use
> > > > >  the following kernel parameters to handle the possible issues:
> > > > > -A. button.lid_init_state=open:
> > > > > +A. button.lid_init_state=method:
> > > > > +   When this option is specified, the ACPI button driver reports the
> > > > > +   initial lid state using the returning value of the _LID control method
> > > > > +   and whether the "opened"/"closed" events are paired fully relies on the
> > > > > +   firmware implementation.
> > > > > +   This option can be used to fix some platforms where the returning value
> > > > > +   of the _LID control method is reliable but the initial lid state
> > > > > +   notification is missing.
> > > > > +   This option is the default behavior during the period the userspace
> > > > > +   isn't ready to handle the buggy AML tables.
> > > > > +B. button.lid_init_state=open:
> > > > >     When this option is specified, the ACPI button driver always reports the
> > > > >     initial lid state as "opened" and whether the "opened"/"closed" events
> > > > >     are paired fully relies on the firmware implementation.
> > > > >     This may fix some platforms where the returning value of the _LID
> > > > >     control method is not reliable and the initial lid state notification is
> > > > >     missing.
> > > > > -   This option is the default behavior during the period the userspace
> > > > > -   isn't ready to handle the buggy AML tables.
> > > > >
> > > > >  If the userspace has been prepared to ignore the unreliable "opened" events
> > > > >  and the unreliable initial state notification, Linux users should always
> > > > >  use the following kernel parameter:
> > > > > -B. button.lid_init_state=ignore:
> > > > > +C. button.lid_init_state=ignore:
> > > > >     When this option is specified, the ACPI button driver never reports the
> > > > >     initial lid state and there is a compensation mechanism implemented to
> > > > >     ensure that the reliable "closed" notifications can always be delievered
> > > > > diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
> > > > > index 668137e..6d5a8c1 100644
> > > > > --- a/drivers/acpi/button.c
> > > > > +++ b/drivers/acpi/button.c
> > > > > @@ -57,6 +57,7 @@
> > > > >
> > > > >  #define ACPI_BUTTON_LID_INIT_IGNORE	0x00
> > > > >  #define ACPI_BUTTON_LID_INIT_OPEN	0x01
> > > > > +#define ACPI_BUTTON_LID_INIT_METHOD	0x02
> > > > >
> > > > >  #define _COMPONENT		ACPI_BUTTON_COMPONENT
> > > > >  ACPI_MODULE_NAME("button");
> > > > > @@ -376,6 +377,9 @@ static void acpi_lid_initialize_state(struct acpi_device *device)
> > > > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > > > >  		(void)acpi_lid_notify_state(device, 1);
> > > > >  		break;
> > > > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > > > +		(void)acpi_lid_update_state(device);
> > > > > +		break;
> > > > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > > > >  	default:
> > > > >  		break;
> > > > > @@ -559,6 +563,9 @@ static int param_set_lid_init_state(const char *val, struct kernel_param *kp)
> > > > >  	if (!strncmp(val, "open", sizeof("open") - 1)) {
> > > > >  		lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;
> > > > >  		pr_info("Notify initial lid state as open\n");
> > > > > +	} else if (!strncmp(val, "method", sizeof("method") - 1)) {
> > > > > +		lid_init_state = ACPI_BUTTON_LID_INIT_METHOD;
> > > > > +		pr_info("Notify initial lid state with _LID return value\n");
> > > > >  	} else if (!strncmp(val, "ignore", sizeof("ignore") - 1)) {
> > > > >  		lid_init_state = ACPI_BUTTON_LID_INIT_IGNORE;
> > > > >  		pr_info("Do not notify initial lid state\n");
> > > > > @@ -572,6 +579,8 @@ static int param_get_lid_init_state(char *buffer, struct kernel_param *kp)
> > > > >  	switch (lid_init_state) {
> > > > >  	case ACPI_BUTTON_LID_INIT_OPEN:
> > > > >  		return sprintf(buffer, "open");
> > > > > +	case ACPI_BUTTON_LID_INIT_METHOD:
> > > > > +		return sprintf(buffer, "method");
> > > > >  	case ACPI_BUTTON_LID_INIT_IGNORE:
> > > > >  		return sprintf(buffer, "ignore");
> > > > >  	default:
> > > > > --
> > > > > 2.9.3
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-12  2:36       ` Zheng, Lv
@ 2017-05-12 21:50         ` Rafael J. Wysocki
  2017-05-15  7:45           ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-12 21:50 UTC (permalink / raw)
  To: Zheng, Lv, Benjamin Tissoires; +Cc: Jiri Eischmann, linux-acpi, linux-kernel

On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> Hi, 
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > 
> > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > Hi,
> > >
> > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > > >
> > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > > >
> > > > Even if the method implementation can be buggy on some platform,
> > > > the "open" choice is worse. It breaks docking stations basically
> > > > and there is no way to have a user-space hwdb to fix that.
> > > >
> > > > On the contrary, it's rather easy in user-space to have a hwdb
> > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > > the state of the LID switch for us: you need to set the udev
> > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > > >
> > > > When libinput detects internal keyboard events, it will
> > > > overwrite the state of the switch to open, making it reliable
> > > > again. Given that logind only checks the LID switch value after
> > > > a timeout, we can assume the user will use the internal keyboard
> > > > before this timeout expires.
> > > >
> > > > For example, such a hwdb entry is:
> > > >
> > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > >
> > > For the reason mentioned previously and proofs here (see patch descriptions):
> > > https://patchwork.kernel.org/patch/9717111/
> > 
> > I am not sure you can call this proofs. The "close" state is IMO the
> > exact same as the "method" one, it's just that the intel driver
> > triggers the evaluation of the _LID method, not acpi/button.c.
> 
> I should correct you that:
> 1. intel_lvds driver only evaluates _LID method for its own purpose,
>    See my reply to PATCH 4-5;
> 2. acpi/button.c is still responsible for generating the lid input event (to input layer and thus the userspace)
>    And the input event generated by button.c differs for the 2 modes.
>    See my another reply to PATCH 02.
> 
> > 
> > And remember that this new default prevents userspace to fix it because
> > it's rather easy to detect that the device is actually opened (input
> > events coming from interanl keyboard, touchscreen, or touchpad), while
> > reporting the lid switch as open means we can't know if the user is
> > simply not using the internal devices, or if we have just a wrong state.
> 
> That depends on what the meaning of "fix" is IMO.
> I saw you wrote a lot in another message, let's discuss this there.
> 
> > 
> > Given that this patch breaks all Lenovos with a dock (I can guess that 6
> > lines this year are using docks, and within each line you have 2-5
> > variants), I'd suggest we do not break those existing laptops and just
> > revert this patch.
> > 
> > I would think other OEMs have a docking station with an actual power
> > button but I can't be sure by looking at the product pages.
> 
> I'm not sure if it breaks the external monitor usage model.

Well, if it worked in a specific way that users depended on before the commit in
question and now it works differently, then it does break things.

Benjamin, my understanding is that this is the case, is it correct?

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-12  9:50           ` Benjamin Tissoires
@ 2017-05-15  4:54               ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-15  4:54 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, Benjamin

Let's stop endless discussing and focus on our needs.

I just copied my questions here.
You can ask them directly.
For the below inlined replies, I'll stop replying if they are based on dependent on our basic agreements.
And I'll reply if something is really bad from my point of view.

My point of view is:
There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.
Button driver default behavior should be: button.lid_init_state=ignore
If user space programs have special needs, they can fix problems on their own, via the following mean:
 echo -n "open" > /sys/modules/button/parameters/lid_init_state
 echo -n "close" > /sys/modules/button/parameters/lid_init_state
Keeping open/close modes is because I don't think there is any bug in button driver.
So I need to prepare quirk modes from button driver's point of view and use them as a response to related bug reports reported in acpi community.
Your point of view is:
There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.
Button driver default behavior should be (not 100% sure if this is your opinion): button.lid_init_state=method
If user space programs have special needs, they can fix them on their own, via the following mean:
 libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
From this point of view, we actually don't need open/close modes.

It seems we just need to determine the following first:
1. Who should be responsible for solving bugs triggered by the conflict between bios and linux user space expectations:
   button driver? libinput? Some other user space programs? Users?
2. What should be the default button driver behavior?
   button.lid_init_state=ignore? button.lid_init_state=method?
3. If non button driver quirks are working, button driver quirk modes are useless.
   The question is: Should button.lid_init_state=open/close be kept?
4. From button driver's point of view, button.lid_init_state=ignore seems to be always correct, so we won't abandon it.
   If we can use libinput to manage platform quirks, then button.lid_init_state=method also looks useless.
   The question is: Should button.lid_init_state=method be kept?

I should also let you know my preference:
1. using button.lid_init_state=ignore and button.lid_init_state=method as default behavior is ok for me if answer to 1 is not button driver, otherwise using button.lid_init_state=method is not ok for me
2. deletion of button.lid_init_state=open/close is ok for me if answer to 1 is not button driver, otherwise deletion of button.lid_init_state=open/close is not ok for me
3. deletion of button.lid_init_state=method is always ok for me

See the base line from my side is very clear:
If acpi community need to handle such bug reports, button.lid_init_state=method cannot be the default behavior.
We are just using a different default behavior than "method" to drive things to reach the final root caused solution.

Could you let me know your preference so that we can figure out an agreement between us.
Though I don't know if end users will buy it (they may keep on filing regression reports in ACPI community).

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Subject: Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.
Let's just stop arguing same thing again and again using different contexts.

> > > make reasonable assumptions based on the exact
> > > model capabilities (is the power button accessible with the LID closed),
> >
> > Sounds it's always accessible.
> 
> hmm, I am not sure we have the same fingers.... For me on all the
> laptops I have seen, if the LID is actually (physically) closed, I can
> not press the button. It's a design choice to not have anything powering
> on the laptop when it's in your bag.

Wow...
I can see that recent laptops are having their power buttons on the edge, not on the keypad.
For recent laptop/tablet 2-in-1 PCs or touch laptops which can have their keyboards folded and act notebooks, the power buttons are obviously not on the detachable keypads.
Also for the laptops supporting external monitors, obviously allow users to push power buttons while the lid is closed.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.


> > > The issue we are fixing here is the fact that the switch state is wrong,
> > > which makes user space assumptions wrong too (and users angry).
> >
> > Considering no platform quirks.
> >
> > If ACPI button driver sends SW_LID, users are likely angry.
> > Unless the user space programs are changed to "ignore open event".
> >
> > If ACPI button driver doesn't send switch events, but key events.
> > The user space programs need to change to handle the new events.
> >
> > So finally whatever we do, user space need to change a bit related to ACPI control method lid device.
> 
> Or we fix the switch to report accurate events/state.
> You do realise that all the energy you are spending, answering to me,
> talking to user space maintainers, users, all comes down to the fact
> that you refuse to have hardware quirks?
> 
> If you don't want to write/maintain the code, fine. I can do it, but
> please stop trying to make everyone else change because you just don't
> want the burden of quirking a handful of laptops.

But I have a very interesting question to ask:
Can button driver/libinput actually fix this without modifying BIOS:

It seems we both rely on user's decision to use proper quirk modes for a specific usage model.
For example, for this case:
 A BIOS never sends lid notification after boot/resume, and _LID return value is close after boot/resume.
Let's see which quirk mode should the user choose before suspending:
1. with no external monitor attached, user should:
    write_open/button.lid_init_state=open
2. with external monitor attached, if user wants the lid to be lit on
    write_open/button.lid_init_state=open
3. with external monitor attached, if user wants the lid not to be lit on
    write_close/button.lid_init_state=close
See there is no single default value for all usage models.

Thus there is no possible __FIX__ for acpi button driver and libinput.
While user space programs can just fix their usage models.

So finally we actually cannot fix anything by maintaining the quirk table, but just create a business of maintaining the quirk table.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > > > However, is that possible to not introduce platform quirks?
> > > > For example, for systemd:
> > > > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like
> nouveau
> > > drivers' ignorelid=Y option).
> > >
> > > Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> > > remove the feature from all laptops.
> >
> > No, I only removed the wrong usage models related to the strict "open" event.
> > All laptops are still capable of sending correct "close" event.
> 
> My bad, I read too fast and missed the "...Open..." part of
> "IgnoreLidOpenEvent".
> 
> Though I am not sure IgnoreLidOpenEvent is accurate.
> "OpenEventNeverSent" seems to reflect more the reality. But again,
> that's not of our (kernel developers) business.

IMO this is the only root cause fix. :)
It's the only way that the user can use without changing its quirk modes for different usage models.

> > So I already raised this to freedesktop:
> > https://bugs.freedesktop.org/show_bug.cgi?id=100923
> > But couldn't see any one there responding.
> 
> Well, Joachim answered, saying that there is likely a regression in
> acpi/button.c, not in i915.

He surely can add more details as he is responsible of triage of the referenced bug reported on redhat.
But there is no agreement reached yet.

> > > > Can it determine that by its own without listening to the lid key events?
> > >
> > > Basically no. This switch is there for a reason. However, I am convinced
> > > that a good heuristic is to say that if you are using the internal keyboard,
> > > touchscreen or touchpad, unless the user has very very thin fingers, the
> > > lid is open.
> >
> > I'm also convinced that the benefit of having a file in sysfs/procfs to allow user to know if the
> lid is open is marginal.
> 
> I am convinced I don't get your point. We are obviously not talking
> about the same thing here. I was talking about the physical world, where
> the user interact with the laptop...

See my previous reply against accessing power buttons when lid is closed.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > But it's just 1-2 months Bugzilla silence before seeing a different bug flood trend:
> > This time, they are related to the external monitor usage model:
> > https://bugzilla.kernel.org/show_bug.cgi?id=187271
> > https://bugzilla.redhat.com/show_bug.cgi?id=1430259
> > https://bugzilla.kernel.org/show_bug.cgi?id=195455
> > Now that you can see why I didn't send a patch to change the default behavior to "open" at the time
> we were discussing.
> > That's also why I think we needn't revert back to "method" as the default behavior.
> 
> Still, I strongly disagree. We do not fix 7 devices by breaking
> hundreds. That's just not how the kernel work.

No. The word of "broken" is entirely wrong and emotional here.
If both user space and kernel space are changed to act according to button.lid_init_state=ignore.
No one will be broken.
And no future laptops will be broken.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > > > After we root cause that's not a kernel problem, do we have mean in other community to handle
> such
> > > reports?
> > > > For example, to collect hwdb entries.
> > >
> > > libinput, systemd are good candidate.
> > > Libinput already has the bits in place, so I'd say we should probably
> > > ask the users to report a bug on the wayland/libinput component of
> > > bugs.freedesktop.org. But this will only work if the default initial lid
> > > state is "method".
> >
> > Good to know. :)
> > I've just changed the category of the forwarded report.
> 
> Well, I am not sure your bug report should have been changed. The bug
> report was regarding i915 being confident in the _LID acpi call, and
> that is not something libinput can change.

OK, I see.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
@ 2017-05-15  4:54               ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-15  4:54 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, Benjamin

Let's stop endless discussing and focus on our needs.

I just copied my questions here.
You can ask them directly.
For the below inlined replies, I'll stop replying if they are based on dependent on our basic agreements.
And I'll reply if something is really bad from my point of view.

My point of view is:
There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.
Button driver default behavior should be: button.lid_init_state=ignore
If user space programs have special needs, they can fix problems on their own, via the following mean:
 echo -n "open" > /sys/modules/button/parameters/lid_init_state
 echo -n "close" > /sys/modules/button/parameters/lid_init_state
Keeping open/close modes is because I don't think there is any bug in button driver.
So I need to prepare quirk modes from button driver's point of view and use them as a response to related bug reports reported in acpi community.
Your point of view is:
There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.
Button driver default behavior should be (not 100% sure if this is your opinion): button.lid_init_state=method
If user space programs have special needs, they can fix them on their own, via the following mean:
 libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
>From this point of view, we actually don't need open/close modes.

It seems we just need to determine the following first:
1. Who should be responsible for solving bugs triggered by the conflict between bios and linux user space expectations:
   button driver? libinput? Some other user space programs? Users?
2. What should be the default button driver behavior?
   button.lid_init_state=ignore? button.lid_init_state=method?
3. If non button driver quirks are working, button driver quirk modes are useless.
   The question is: Should button.lid_init_state=open/close be kept?
4. From button driver's point of view, button.lid_init_state=ignore seems to be always correct, so we won't abandon it.
   If we can use libinput to manage platform quirks, then button.lid_init_state=method also looks useless.
   The question is: Should button.lid_init_state=method be kept?

I should also let you know my preference:
1. using button.lid_init_state=ignore and button.lid_init_state=method as default behavior is ok for me if answer to 1 is not button driver, otherwise using button.lid_init_state=method is not ok for me
2. deletion of button.lid_init_state=open/close is ok for me if answer to 1 is not button driver, otherwise deletion of button.lid_init_state=open/close is not ok for me
3. deletion of button.lid_init_state=method is always ok for me

See the base line from my side is very clear:
If acpi community need to handle such bug reports, button.lid_init_state=method cannot be the default behavior.
We are just using a different default behavior than "method" to drive things to reach the final root caused solution.

Could you let me know your preference so that we can figure out an agreement between us.
Though I don't know if end users will buy it (they may keep on filing regression reports in ACPI community).

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Subject: Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.
Let's just stop arguing same thing again and again using different contexts.

> > > make reasonable assumptions based on the exact
> > > model capabilities (is the power button accessible with the LID closed),
> >
> > Sounds it's always accessible.
> 
> hmm, I am not sure we have the same fingers.... For me on all the
> laptops I have seen, if the LID is actually (physically) closed, I can
> not press the button. It's a design choice to not have anything powering
> on the laptop when it's in your bag.

Wow...
I can see that recent laptops are having their power buttons on the edge, not on the keypad.
For recent laptop/tablet 2-in-1 PCs or touch laptops which can have their keyboards folded and act notebooks, the power buttons are obviously not on the detachable keypads.
Also for the laptops supporting external monitors, obviously allow users to push power buttons while the lid is closed.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.


> > > The issue we are fixing here is the fact that the switch state is wrong,
> > > which makes user space assumptions wrong too (and users angry).
> >
> > Considering no platform quirks.
> >
> > If ACPI button driver sends SW_LID, users are likely angry.
> > Unless the user space programs are changed to "ignore open event".
> >
> > If ACPI button driver doesn't send switch events, but key events.
> > The user space programs need to change to handle the new events.
> >
> > So finally whatever we do, user space need to change a bit related to ACPI control method lid device.
> 
> Or we fix the switch to report accurate events/state.
> You do realise that all the energy you are spending, answering to me,
> talking to user space maintainers, users, all comes down to the fact
> that you refuse to have hardware quirks?
> 
> If you don't want to write/maintain the code, fine. I can do it, but
> please stop trying to make everyone else change because you just don't
> want the burden of quirking a handful of laptops.

But I have a very interesting question to ask:
Can button driver/libinput actually fix this without modifying BIOS:

It seems we both rely on user's decision to use proper quirk modes for a specific usage model.
For example, for this case:
 A BIOS never sends lid notification after boot/resume, and _LID return value is close after boot/resume.
Let's see which quirk mode should the user choose before suspending:
1. with no external monitor attached, user should:
    write_open/button.lid_init_state=open
2. with external monitor attached, if user wants the lid to be lit on
    write_open/button.lid_init_state=open
3. with external monitor attached, if user wants the lid not to be lit on
    write_close/button.lid_init_state=close
See there is no single default value for all usage models.

Thus there is no possible __FIX__ for acpi button driver and libinput.
While user space programs can just fix their usage models.

So finally we actually cannot fix anything by maintaining the quirk table, but just create a business of maintaining the quirk table.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > > > However, is that possible to not introduce platform quirks?
> > > > For example, for systemd:
> > > > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like
> nouveau
> > > drivers' ignorelid=Y option).
> > >
> > > Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> > > remove the feature from all laptops.
> >
> > No, I only removed the wrong usage models related to the strict "open" event.
> > All laptops are still capable of sending correct "close" event.
> 
> My bad, I read too fast and missed the "...Open..." part of
> "IgnoreLidOpenEvent".
> 
> Though I am not sure IgnoreLidOpenEvent is accurate.
> "OpenEventNeverSent" seems to reflect more the reality. But again,
> that's not of our (kernel developers) business.

IMO this is the only root cause fix. :)
It's the only way that the user can use without changing its quirk modes for different usage models.

> > So I already raised this to freedesktop:
> > https://bugs.freedesktop.org/show_bug.cgi?id=100923
> > But couldn't see any one there responding.
> 
> Well, Joachim answered, saying that there is likely a regression in
> acpi/button.c, not in i915.

He surely can add more details as he is responsible of triage of the referenced bug reported on redhat.
But there is no agreement reached yet.

> > > > Can it determine that by its own without listening to the lid key events?
> > >
> > > Basically no. This switch is there for a reason. However, I am convinced
> > > that a good heuristic is to say that if you are using the internal keyboard,
> > > touchscreen or touchpad, unless the user has very very thin fingers, the
> > > lid is open.
> >
> > I'm also convinced that the benefit of having a file in sysfs/procfs to allow user to know if the
> lid is open is marginal.
> 
> I am convinced I don't get your point. We are obviously not talking
> about the same thing here. I was talking about the physical world, where
> the user interact with the laptop...

See my previous reply against accessing power buttons when lid is closed.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > But it's just 1-2 months Bugzilla silence before seeing a different bug flood trend:
> > This time, they are related to the external monitor usage model:
> > https://bugzilla.kernel.org/show_bug.cgi?id=187271
> > https://bugzilla.redhat.com/show_bug.cgi?id=1430259
> > https://bugzilla.kernel.org/show_bug.cgi?id=195455
> > Now that you can see why I didn't send a patch to change the default behavior to "open" at the time
> we were discussing.
> > That's also why I think we needn't revert back to "method" as the default behavior.
> 
> Still, I strongly disagree. We do not fix 7 devices by breaking
> hundreds. That's just not how the kernel work.

No. The word of "broken" is entirely wrong and emotional here.
If both user space and kernel space are changed to act according to button.lid_init_state=ignore.
No one will be broken.
And no future laptops will be broken.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

> > > > After we root cause that's not a kernel problem, do we have mean in other community to handle
> such
> > > reports?
> > > > For example, to collect hwdb entries.
> > >
> > > libinput, systemd are good candidate.
> > > Libinput already has the bits in place, so I'd say we should probably
> > > ask the users to report a bug on the wayland/libinput component of
> > > bugs.freedesktop.org. But this will only work if the default initial lid
> > > state is "method".
> >
> > Good to know. :)
> > I've just changed the category of the forwarded report.
> 
> Well, I am not sure your bug report should have been changed. The bug
> report was regarding i915 being confident in the _LID acpi call, and
> that is not something libinput can change.

OK, I see.

> ...
Skip as it looks the result of debating here can also be concluded from the answers of the questions.

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-15  4:54               ` Zheng, Lv
  (?)
@ 2017-05-15  7:42               ` Benjamin Tissoires
  2017-05-16  5:05                 ` Zheng, Lv
  -1 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-15  7:42 UTC (permalink / raw)
  To: Zheng, Lv; +Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi Lv,

On May 15 2017 or thereabouts, Zheng, Lv wrote:
> Hi, Benjamin
> 
> Let's stop endless discussing and focus on our needs.
> 
> I just copied my questions here.
> You can ask them directly.
> For the below inlined replies, I'll stop replying if they are based on dependent on our basic agreements.
> And I'll reply if something is really bad from my point of view.
> 
> My point of view is:
> There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.
> Button driver default behavior should be: button.lid_init_state=ignore
> If user space programs have special needs, they can fix problems on their own, via the following mean:
>  echo -n "open" > /sys/modules/button/parameters/lid_init_state
>  echo -n "close" > /sys/modules/button/parameters/lid_init_state
> Keeping open/close modes is because I don't think there is any bug in button driver.
> So I need to prepare quirk modes from button driver's point of view and use them as a response to related bug reports reported in acpi community.
> Your point of view is:
> There is a gap between (BIOS ensured/Windows expected) acpi control method lid device behavior and Linux user space expected acpi control method lid device behavior.

Yep. But our role, as kernel developers, is to not break user space even
if they made wrong design choices.

> Button driver default behavior should be (not 100% sure if this is your opinion): button.lid_init_state=method

Yes, I'd like to revert to the old behavior (see below for a rationale).

> If user space programs have special needs, they can fix them on their own, via the following mean:
>  libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
>   LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> From this point of view, we actually don't need open/close modes.

We don't need close, but we need to keep open, for 2 reasons I'll detail
below.

> 
> It seems we just need to determine the following first:
> 1. Who should be responsible for solving bugs triggered by the conflict between bios and linux user space expectations:
>    button driver? libinput? Some other user space programs? Users?

Hopefully libinput or systemd (through a udev rule). If things gets
worse a acpi/button quirk might be used, but in a second time.

> 2. What should be the default button driver behavior?
>    button.lid_init_state=ignore? button.lid_init_state=method?

button.lid_init_state=method:
- this was the old default state, and switching to something else
  creates regression
- the lid switch input node is an EV_SWITCH. And this has a state we
  need to synchronise with the hardware first. The default EV_SW state
  is closed, so if we do not sync it first, we might report wrong state
  to user space.

> 3. If non button driver quirks are working, button driver quirk modes are useless.
>    The question is: Should button.lid_init_state=open/close be kept?

We should keep button.lid_init_state=open:
- it's kernel API now, so you can't remove it without breaking users
  configuration
- it helps booting laptops that are not known to be working, the time
  the user installs the distribution and add the hwdb entry to fix it.

we should not keep button.lid_init_state=close:
- I don't see a good use case where it is needed, besides debugging
  drivers that should be debugged in an other way.

> 4. From button driver's point of view, button.lid_init_state=ignore seems to be always correct, so we won't abandon it.

We can keep it, and we should, it's kernel API

>    If we can use libinput to manage platform quirks, then button.lid_init_state=method also looks useless.

No. 'method' is the only way to guarantee the exported input node is
synced properly with the actual physical world.

>    The question is: Should button.lid_init_state=method be kept?

Definitively.

> 
> I should also let you know my preference:
> 1. using button.lid_init_state=ignore and button.lid_init_state=method as default behavior is ok for me if answer to 1 is not button driver, otherwise using button.lid_init_state=method is not ok for me
> 2. deletion of button.lid_init_state=open/close is ok for me if answer to 1 is not button driver, otherwise deletion of button.lid_init_state=open/close is not ok for me
> 3. deletion of button.lid_init_state=method is always ok for me
> 
> See the base line from my side is very clear:
> If acpi community need to handle such bug reports, button.lid_init_state=method cannot be the default behavior.

I already proposed to handle those reports. I don't see why you are
concerned about those future reports.

> We are just using a different default behavior than "method" to drive things to reach the final root caused solution.
> 
> Could you let me know your preference so that we can figure out an agreement between us.
> Though I don't know if end users will buy it (they may keep on filing regression reports in ACPI community).

Switching back to button.lid_init_state=method will prevent regressions
to be filled. If a user buys a new machine and the LID switch doesn't
work as expected, it's not a regression, it's a bug. Regression is
something we absolutely need to avoid, and switching to
button.lid_init_state=open introduced far too many regressions on
systems.

> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Subject: Re: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> Let's just stop arguing same thing again and again using different contexts.
> 
> > > > make reasonable assumptions based on the exact
> > > > model capabilities (is the power button accessible with the LID closed),
> > >
> > > Sounds it's always accessible.
> > 
> > hmm, I am not sure we have the same fingers.... For me on all the
> > laptops I have seen, if the LID is actually (physically) closed, I can
> > not press the button. It's a design choice to not have anything powering
> > on the laptop when it's in your bag.
> 
> Wow...
> I can see that recent laptops are having their power buttons on the edge, not on the keypad.
> For recent laptop/tablet 2-in-1 PCs or touch laptops which can have their keyboards folded and act notebooks, the power buttons are obviously not on the detachable keypads.

Right. I forgot about this new class of tablets/convertible. I was
probably too focused on the professional laptops.

> Also for the laptops supporting external monitors, obviously allow users to push power buttons while the lid is closed.

Well, on the Thinkpads (and Fujitsu from what I can tell), this
situation can happen because there is a physical power button on the
docking station itself.

> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> 
> 
> > > > The issue we are fixing here is the fact that the switch state is wrong,
> > > > which makes user space assumptions wrong too (and users angry).
> > >
> > > Considering no platform quirks.
> > >
> > > If ACPI button driver sends SW_LID, users are likely angry.
> > > Unless the user space programs are changed to "ignore open event".
> > >
> > > If ACPI button driver doesn't send switch events, but key events.
> > > The user space programs need to change to handle the new events.
> > >
> > > So finally whatever we do, user space need to change a bit related to ACPI control method lid device.
> > 
> > Or we fix the switch to report accurate events/state.
> > You do realise that all the energy you are spending, answering to me,
> > talking to user space maintainers, users, all comes down to the fact
> > that you refuse to have hardware quirks?
> > 
> > If you don't want to write/maintain the code, fine. I can do it, but
> > please stop trying to make everyone else change because you just don't
> > want the burden of quirking a handful of laptops.
> 
> But I have a very interesting question to ask:
> Can button driver/libinput actually fix this without modifying BIOS:
> 
> It seems we both rely on user's decision to use proper quirk modes for a specific usage model.
> For example, for this case:
>  A BIOS never sends lid notification after boot/resume, and _LID return value is close after boot/resume.

It used to work for years (_LID return value is correct on most laptops
after boot/resume).

> Let's see which quirk mode should the user choose before suspending:
> 1. with no external monitor attached, user should:
>     write_open/button.lid_init_state=open
> 2. with external monitor attached, if user wants the lid to be lit on
>     write_open/button.lid_init_state=open
> 3. with external monitor attached, if user wants the lid not to be lit on
>     write_close/button.lid_init_state=close
> See there is no single default value for all usage models.

No. Users shouldn't interact with kernel parameters. The kernel
parameter can help them to boot the system if the user space is not able
to fix the situation. But in any case, end users should not have to deal
with kernel parameters.

> 
> Thus there is no possible __FIX__ for acpi button driver and libinput.

I never talked about a fix. I know the situation is unsolvable, which is
why I talked about quirks or workarounds.

> While user space programs can just fix their usage models.

You can't expect user space to change anything from the kernel point of
view without a long enough warning.

> 
> So finally we actually cannot fix anything by maintaining the quirk table, but just create a business of maintaining the quirk table.

Yes, but I never asked you to maintain such table. I am volunteering to
do that if needed.

> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> 
> > > > > However, is that possible to not introduce platform quirks?
> > > > > For example, for systemd:
> > > > > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like
> > nouveau
> > > > drivers' ignorelid=Y option).
> > > >
> > > > Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> > > > remove the feature from all laptops.
> > >
> > > No, I only removed the wrong usage models related to the strict "open" event.
> > > All laptops are still capable of sending correct "close" event.
> > 
> > My bad, I read too fast and missed the "...Open..." part of
> > "IgnoreLidOpenEvent".
> > 
> > Though I am not sure IgnoreLidOpenEvent is accurate.
> > "OpenEventNeverSent" seems to reflect more the reality. But again,
> > that's not of our (kernel developers) business.
> 
> IMO this is the only root cause fix. :)
> It's the only way that the user can use without changing its quirk modes for different usage models.

Yes and no. There is a design issue in logind, but this is based on the
design choice made in acpi/button: we use an input switch. A switch has
a state, and the kernel ought to report correct state. With this in
mind, the design choice in logind is correct. But now we are
seeing that some OEM are not providing everything we need, and we need
to find the solution.

> 
> > > So I already raised this to freedesktop:
> > > https://bugs.freedesktop.org/show_bug.cgi?id=100923
> > > But couldn't see any one there responding.
> > 
> > Well, Joachim answered, saying that there is likely a regression in
> > acpi/button.c, not in i915.
> 
> He surely can add more details as he is responsible of triage of the referenced bug reported on redhat.

(Joachim doesn't work for Red Hat, from what I can tell).

> But there is no agreement reached yet.
> 
> > > > > Can it determine that by its own without listening to the lid key events?
> > > >
> > > > Basically no. This switch is there for a reason. However, I am convinced
> > > > that a good heuristic is to say that if you are using the internal keyboard,
> > > > touchscreen or touchpad, unless the user has very very thin fingers, the
> > > > lid is open.
> > >
> > > I'm also convinced that the benefit of having a file in sysfs/procfs to allow user to know if the
> > lid is open is marginal.
> > 
> > I am convinced I don't get your point. We are obviously not talking
> > about the same thing here. I was talking about the physical world, where
> > the user interact with the laptop...
> 
> See my previous reply against accessing power buttons when lid is closed.
> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> 
> > > But it's just 1-2 months Bugzilla silence before seeing a different bug flood trend:
> > > This time, they are related to the external monitor usage model:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=187271
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1430259
> > > https://bugzilla.kernel.org/show_bug.cgi?id=195455
> > > Now that you can see why I didn't send a patch to change the default behavior to "open" at the time
> > we were discussing.
> > > That's also why I think we needn't revert back to "method" as the default behavior.
> > 
> > Still, I strongly disagree. We do not fix 7 devices by breaking
> > hundreds. That's just not how the kernel work.
> 
> No. The word of "broken" is entirely wrong and emotional here.

Yes. There was bug before, and now there are regressions.

> If both user space and kernel space are changed to act according to button.lid_init_state=ignore.

Again, you can't expect a good synchronisation between kernel space and
user space without regressing some configurations (yes, it's a pain).

> No one will be broken.

'ignore' is not the solution, see above for the rationale.

> And no future laptops will be broken.

no comments.

> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> 
> > > > > After we root cause that's not a kernel problem, do we have mean in other community to handle
> > such
> > > > reports?
> > > > > For example, to collect hwdb entries.
> > > >
> > > > libinput, systemd are good candidate.
> > > > Libinput already has the bits in place, so I'd say we should probably
> > > > ask the users to report a bug on the wayland/libinput component of
> > > > bugs.freedesktop.org. But this will only work if the default initial lid
> > > > state is "method".
> > >
> > > Good to know. :)
> > > I've just changed the category of the forwarded report.
> > 
> > Well, I am not sure your bug report should have been changed. The bug
> > report was regarding i915 being confident in the _LID acpi call, and
> > that is not something libinput can change.
> 
> OK, I see.
> 
> > ...
> Skip as it looks the result of debating here can also be concluded from the answers of the questions.
> 

Thanks for your answer, however, you did not answered my questions:

Would you agree on:
- acking both my patches so Rafael can take them ans stable can apply
  them ASAP
- give me maintainership of drivers/acpi/button.c (ACK a following patch
  where I add me to MAINTAINERS)
- redirect all kernel bugs to me related to LID switch (or having me the
  default assignee for a new acpi/button component)
- allow me to add a list of quirk in acpi/button.c if I judge this as
  fit

If you agree on that, that would be good for both our sake I think.

Cheers,
Benjamin

> Cheers,
> Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-12 21:50         ` Rafael J. Wysocki
@ 2017-05-15  7:45           ` Benjamin Tissoires
  2017-05-15  9:17             ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-15  7:45 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> > Hi, 
> > 
> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > > 
> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > > Hi,
> > > >
> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > > > >
> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > > > >
> > > > > Even if the method implementation can be buggy on some platform,
> > > > > the "open" choice is worse. It breaks docking stations basically
> > > > > and there is no way to have a user-space hwdb to fix that.
> > > > >
> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > > > the state of the LID switch for us: you need to set the udev
> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > > > >
> > > > > When libinput detects internal keyboard events, it will
> > > > > overwrite the state of the switch to open, making it reliable
> > > > > again. Given that logind only checks the LID switch value after
> > > > > a timeout, we can assume the user will use the internal keyboard
> > > > > before this timeout expires.
> > > > >
> > > > > For example, such a hwdb entry is:
> > > > >
> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > >

[...]

> 
> Well, if it worked in a specific way that users depended on before the commit in
> question and now it works differently, then it does break things.
> 
> Benjamin, my understanding is that this is the case, is it correct?

That is correct. This patch I reverted introduces regression for professional
laptops that expect the LID switch to be reported accurately.

Cheers,
Benjamin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15  7:45           ` Benjamin Tissoires
@ 2017-05-15  9:17             ` Rafael J. Wysocki
  2017-05-15  9:37               ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-15  9:17 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
> On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
>> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
>> > Hi,
>> >
>> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
>> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
>> > >
>> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
>> > > > Hi,
>> > > >
>> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
>> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
>> > > > >
>> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
>> > > > >
>> > > > > Even if the method implementation can be buggy on some platform,
>> > > > > the "open" choice is worse. It breaks docking stations basically
>> > > > > and there is no way to have a user-space hwdb to fix that.
>> > > > >
>> > > > > On the contrary, it's rather easy in user-space to have a hwdb
>> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
>> > > > > the state of the LID switch for us: you need to set the udev
>> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
>> > > > >
>> > > > > When libinput detects internal keyboard events, it will
>> > > > > overwrite the state of the switch to open, making it reliable
>> > > > > again. Given that logind only checks the LID switch value after
>> > > > > a timeout, we can assume the user will use the internal keyboard
>> > > > > before this timeout expires.
>> > > > >
>> > > > > For example, such a hwdb entry is:
>> > > > >
>> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
>> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
>> > > >
>
> [...]
>
>>
>> Well, if it worked in a specific way that users depended on before the commit in
>> question and now it works differently, then it does break things.
>>
>> Benjamin, my understanding is that this is the case, is it correct?
>
> That is correct. This patch I reverted introduces regression for professional
> laptops that expect the LID switch to be reported accurately.

And from a user's perspective, what does not work any more?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15  9:17             ` Rafael J. Wysocki
@ 2017-05-15  9:37               ` Benjamin Tissoires
  2017-05-15 11:01                 ` Rafael J. Wysocki
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-15  9:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> <benjamin.tissoires@redhat.com> wrote:
> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> >> > Hi,
> >> >
> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >> > >
> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> >> > > > Hi,
> >> > > >
> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >> > > > >
> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> >> > > > >
> >> > > > > Even if the method implementation can be buggy on some platform,
> >> > > > > the "open" choice is worse. It breaks docking stations basically
> >> > > > > and there is no way to have a user-space hwdb to fix that.
> >> > > > >
> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> >> > > > > the state of the LID switch for us: you need to set the udev
> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> >> > > > >
> >> > > > > When libinput detects internal keyboard events, it will
> >> > > > > overwrite the state of the switch to open, making it reliable
> >> > > > > again. Given that logind only checks the LID switch value after
> >> > > > > a timeout, we can assume the user will use the internal keyboard
> >> > > > > before this timeout expires.
> >> > > > >
> >> > > > > For example, such a hwdb entry is:
> >> > > > >
> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> >> > > >
> >
> > [...]
> >
> >>
> >> Well, if it worked in a specific way that users depended on before the commit in
> >> question and now it works differently, then it does break things.
> >>
> >> Benjamin, my understanding is that this is the case, is it correct?
> >
> > That is correct. This patch I reverted introduces regression for professional
> > laptops that expect the LID switch to be reported accurately.
> 
> And from a user's perspective, what does not work any more?

If you boot or resume your laptop with the lid closed on a docking
station while using an external monitor connected to it, both internal
and external displays will light on, while only the external should.

There is a design choice in gdm to only provide the greater on the
internal display when lit on, so users only see a gray area on the
external monitor. Also, the cursor will not show up as it's by default
on the internal display too.

To "fix" that, users have to open the laptop once and close it once
again to sync the state of the switch with the hardware state.

Cheers,
Benjamin

> 
> Thanks,
> Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15  9:37               ` Benjamin Tissoires
@ 2017-05-15 11:01                 ` Rafael J. Wysocki
  2017-05-15 13:05                   ` Benjamin Tissoires
  2017-05-24  8:08                   ` Benjamin Tissoires
  0 siblings, 2 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-15 11:01 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann,
	linux-acpi, linux-kernel

On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
> On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
>> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
>> <benjamin.tissoires@redhat.com> wrote:
>> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
>> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
>> >> > Hi,
>> >> >
>> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
>> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
>> >> > >
>> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
>> >> > > > Hi,
>> >> > > >
>> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
>> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
>> >> > > > >
>> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
>> >> > > > >
>> >> > > > > Even if the method implementation can be buggy on some platform,
>> >> > > > > the "open" choice is worse. It breaks docking stations basically
>> >> > > > > and there is no way to have a user-space hwdb to fix that.
>> >> > > > >
>> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
>> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
>> >> > > > > the state of the LID switch for us: you need to set the udev
>> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
>> >> > > > >
>> >> > > > > When libinput detects internal keyboard events, it will
>> >> > > > > overwrite the state of the switch to open, making it reliable
>> >> > > > > again. Given that logind only checks the LID switch value after
>> >> > > > > a timeout, we can assume the user will use the internal keyboard
>> >> > > > > before this timeout expires.
>> >> > > > >
>> >> > > > > For example, such a hwdb entry is:
>> >> > > > >
>> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
>> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
>> >> > > >
>> >
>> > [...]
>> >
>> >>
>> >> Well, if it worked in a specific way that users depended on before the commit in
>> >> question and now it works differently, then it does break things.
>> >>
>> >> Benjamin, my understanding is that this is the case, is it correct?
>> >
>> > That is correct. This patch I reverted introduces regression for professional
>> > laptops that expect the LID switch to be reported accurately.
>>
>> And from a user's perspective, what does not work any more?
>
> If you boot or resume your laptop with the lid closed on a docking
> station while using an external monitor connected to it, both internal
> and external displays will light on, while only the external should.
>
> There is a design choice in gdm to only provide the greater on the
> internal display when lit on, so users only see a gray area on the
> external monitor. Also, the cursor will not show up as it's by default
> on the internal display too.
>
> To "fix" that, users have to open the laptop once and close it once
> again to sync the state of the switch with the hardware state.

OK

Yeah, that sucks.

So without the Lv's patch the behavior (on the systems in question) is
as expected, right?

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15 11:01                 ` Rafael J. Wysocki
@ 2017-05-15 13:05                   ` Benjamin Tissoires
  2017-05-16  5:33                     ` Zheng, Lv
  2017-05-24  8:08                   ` Benjamin Tissoires
  1 sibling, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-15 13:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
> <benjamin.tissoires@redhat.com> wrote:
> > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> >> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> >> <benjamin.tissoires@redhat.com> wrote:
> >> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> >> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> >> >> > Hi,
> >> >> >
> >> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> >> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >> >> > >
> >> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> >> >> > > > Hi,
> >> >> > > >
> >> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> >> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >> >> > > > >
> >> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> >> >> > > > >
> >> >> > > > > Even if the method implementation can be buggy on some platform,
> >> >> > > > > the "open" choice is worse. It breaks docking stations basically
> >> >> > > > > and there is no way to have a user-space hwdb to fix that.
> >> >> > > > >
> >> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> >> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> >> >> > > > > the state of the LID switch for us: you need to set the udev
> >> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> >> >> > > > >
> >> >> > > > > When libinput detects internal keyboard events, it will
> >> >> > > > > overwrite the state of the switch to open, making it reliable
> >> >> > > > > again. Given that logind only checks the LID switch value after
> >> >> > > > > a timeout, we can assume the user will use the internal keyboard
> >> >> > > > > before this timeout expires.
> >> >> > > > >
> >> >> > > > > For example, such a hwdb entry is:
> >> >> > > > >
> >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> >> >> > > >
> >> >
> >> > [...]
> >> >
> >> >>
> >> >> Well, if it worked in a specific way that users depended on before the commit in
> >> >> question and now it works differently, then it does break things.
> >> >>
> >> >> Benjamin, my understanding is that this is the case, is it correct?
> >> >
> >> > That is correct. This patch I reverted introduces regression for professional
> >> > laptops that expect the LID switch to be reported accurately.
> >>
> >> And from a user's perspective, what does not work any more?
> >
> > If you boot or resume your laptop with the lid closed on a docking
> > station while using an external monitor connected to it, both internal
> > and external displays will light on, while only the external should.
> >
> > There is a design choice in gdm to only provide the greater on the
> > internal display when lit on, so users only see a gray area on the
> > external monitor. Also, the cursor will not show up as it's by default
> > on the internal display too.
> >
> > To "fix" that, users have to open the laptop once and close it once
> > again to sync the state of the switch with the hardware state.
> 
> OK
> 
> Yeah, that sucks.
> 
> So without the Lv's patch the behavior (on the systems in question) is
> as expected, right?
> 

Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.

Cheers,
Benjamin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode"
  2017-05-15  7:42               ` Benjamin Tissoires
@ 2017-05-16  5:05                 ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-16  5:05 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J . Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, Benjamin

I reordered the discussion to collect topics and delete things to make discussion shorter.

1. root caused issue:

> > It seems we just need to determine the following first:
> > 1. Who should be responsible for solving bugs triggered by the conflict between bios and linux user
> space expectations:
> >    button driver? libinput? Some other user space programs? Users?
> Hopefully libinput or systemd (through a udev rule). If things gets
> worse a acpi/button quirk might be used, but in a second time.

I have concerns about what's in your mind. :)

So let me high light a root caused issue:
https://bugzilla.kernel.org/show_bug.cgi?id=106151
If we use any "open" modes, the suspend/resume loop can be fixed.
Both "ignore/method" modes cannot fix the problem.
In this bug, lid open event has a huge delay. But it can correctly arrive.
However systemd will force 2nd suspend if it cannot see "open" event instantly after resume.
So why don't systemd fix the issue (the enforcement) prior than letting us (input layer/button driver) to invent workarounds?
IMO, this is a root caused problem and should be the first priority.
Before seeing it is addressed in systemd, any changes made to libinput/button driver may not be proper.

So the order of fixing all troubles are the followings in my mind:
1. system -> should eliminate the enforcement first
2. libinput -> may change lid event type (see my reply below for topic 2)
3. button driver

> > > > > The issue we are fixing here is the fact that the switch state is wrong,
> > > > > which makes user space assumptions wrong too (and users angry).
> > > > Considering no platform quirks.
> > > > If ACPI button driver sends SW_LID, users are likely angry.
> > > > Unless the user space programs are changed to "ignore open event".
> > > > If ACPI button driver doesn't send switch events, but key events.
> > > > The user space programs need to change to handle the new events.
> > > > So finally whatever we do, user space need to change a bit related to ACPI control method lid
> device.
> > > Or we fix the switch to report accurate events/state.
> > > You do realise that all the energy you are spending, answering to me,
> > > talking to user space maintainers, users, all comes down to the fact
> > > that you refuse to have hardware quirks?

Yes, as we have a root caused but unfixed issue in systemd first.
It's pointless to introduce hardware quirks at this point.

> > Thus there is no possible __FIX__ for acpi button driver and libinput.
> I never talked about a fix. I know the situation is unsolvable, which is
> why I talked about quirks or workarounds.
> > While user space programs can just fix their usage models.
> You can't expect user space to change anything from the kernel point of
> view without a long enough warning.

Why cannot we expect so?
The above issue has already been root caused.

> > > > > > However, is that possible to not introduce platform quirks?
> > > > > > For example, for systemd:
> > > > > > If it detected ACPI lid device, automatically switch to an "IgnoreLidOpenEvent" mode (like
> > > nouveau
> > > > > drivers' ignorelid=Y option).
> > > > > Well, given that 99.9% of laptops have this ACPI lid button, you'll just
> > > > > remove the feature from all laptops.
> > > > No, I only removed the wrong usage models related to the strict "open" event.
> > > > All laptops are still capable of sending correct "close" event.
> > > My bad, I read too fast and missed the "...Open..." part of
> > > "IgnoreLidOpenEvent".
> > > Though I am not sure IgnoreLidOpenEvent is accurate.
> > > "OpenEventNeverSent" seems to reflect more the reality. But again,
> > > that's not of our (kernel developers) business.
> > IMO this is the only root cause fix. :)
> > It's the only way that the user can use without changing its quirk modes for different usage models.
> Yes and no. There is a design issue in logind, but this is based on the
> design choice made in acpi/button: we use an input switch. A switch has
> a state, and the kernel ought to report correct state. With this in
> mind, the design choice in logind is correct. But now we are
> seeing that some OEM are not providing everything we need, and we need
> to find the solution.

We can stop arguing this.
First of all, we just need to check if systemd can remove the enforcement mentioned above.
And what will happen next.
It's likely that there will be no user issues or button driver design issues then.
Thus finally we may needn't introduce any "hardware quirks".

2. keep "method" mode:

> > Button driver default behavior should be (not 100% sure if this is your opinion):
> button.lid_init_state=method
> 
> Yes, I'd like to revert to the old behavior (see below for a rationale).
...
> > 2. What should be the default button driver behavior?
> >    button.lid_init_state=ignore? button.lid_init_state=method?
> button.lid_init_state=method:
> - this was the old default state, and switching to something else
>   creates regression
> - the lid switch input node is an EV_SWITCH. And this has a state we
>   need to synchronise with the hardware first. The default EV_SW state
>   is closed, so if we do not sync it first, we might report wrong state
>   to user space.
...
> >    If we can use libinput to manage platform quirks, then button.lid_init_state=method also looks
> useless.
> No. 'method' is the only way to guarantee the exported input node is
> synced properly with the actual physical world.
> >    The question is: Should button.lid_init_state=method be kept?
> Definitively.

Don't you think we actually have a new feature here?
See, many years ago, all power buttons are on keyboard.
Now all power buttons are on the edge, and many laptops have docking station.
Don't you think we need to re-define the input event type of lid switch?
Or we need to drive users of SW_LID/ACPI lid event to be not so strict to its switch event nature.

So if SW_LID itself or its users are changed in this direction, will that really be useful to keep "method"?

> Switching back to button.lid_init_state=method will prevent regressions
> to be filled. If a user buys a new machine and the LID switch doesn't
> work as expected, it's not a regression, it's a bug. Regression is
> something we absolutely need to avoid, and switching to
> button.lid_init_state=open introduced far too many regressions on
> systems.

IMO, whether "method" can be the default behavior depending on 2 factors:
1. We are discussing on freedesktop to see if things can be improved in graphics world;
2. topic 1 issue can be fixed.
If graphics world is not changed and issue 1 is fixed, we can use "method" mode.
If graphics world is changed and issue 1 is fixed, we can use "ignore" mode.
If graphics world is changed and issue 1 is not fixed, we should stay using "open" mode.
So it might be pointless to determine the default behavior such early.
Given users can work things around using boot parameters.
Aggressively making such a change now just make bug fixing work going back and forth.

3. keep "open/close" mode

> We should keep button.lid_init_state=open:
> - it's kernel API now, so you can't remove it without breaking users
>   configuration
> - it helps booting laptops that are not known to be working, the time
>   the user installs the distribution and add the hwdb entry to fix it.
> > If user space programs have special needs, they can fix them on their own, via the following mean:
> >  libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> >   LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > From this point of view, we actually don't need open/close modes.
> We don't need close, but we need to keep open, for 2 reasons I'll detail
> below.
> > 3. If non button driver quirks are working, button driver quirk modes are useless.
> >    The question is: Should button.lid_init_state=open/close be kept?
> we should not keep button.lid_init_state=close:
> - I don't see a good use case where it is needed, besides debugging
>   drivers that should be debugged in an other way.

If we presuppose that SW_LID cannot change, we surely need quirk modes for some new usage model issues.
IMO,
If quirks are done in libinput, then we needn't "open/close".
If button driver need to provide quirks on its own, then we need both "open/close" to be able to deal with all cases.
I already listed a use case that "open/method" cannot deal with:
No lid notification and _LID return a cached "open" value after boot/resume.
External monitor is connected and lid is closed.
But we haven't reached an agreement yet.
So please answer the following question.
Could you tell me what the quirk mode (button driver mode or libinput node writes, whatever) should we use for the following cases:
1. No lid notification and _LID return value is a cached open after boot/resume
   1.1. External monitor is connected, lid is closed
   1.2. External monitor is connected, lid is opened
2. No lid notification and _LID return value is a cached close after boot/resume
   2.1. External monitor is connected, lid is closed
   2.2. External monitor is connected, lid is opened

4. quirk tables

> > > If you don't want to write/maintain the code, fine. I can do it, but
> > > please stop trying to make everyone else change because you just don't
> > > want the burden of quirking a handful of laptops.
> > See the base line from my side is very clear:
> > If acpi community need to handle such bug reports, button.lid_init_state=method cannot be the
> default behavior.
> I already proposed to handle those reports. I don't see why you are
> concerned about those future reports.
> > So finally we actually cannot fix anything by maintaining the quirk table, but just create a
> business of maintaining the quirk table.
> Yes, but I never asked you to maintain such table. I am volunteering to
> do that if needed.
> Would you agree on:
> - acking both my patches so Rafael can take them ans stable can apply
>   them ASAP
> - give me maintainership of drivers/acpi/button.c (ACK a following patch
>   where I add me to MAINTAINERS)
> - redirect all kernel bugs to me related to LID switch (or having me the
>   default assignee for a new acpi/button component)
> - allow me to add a list of quirk in acpi/button.c if I judge this as
>   fit

As reasons mentioned in topic 1 and 2.
For now, I'd rather to see progresses made in systemd/libinput first.

Actually there is no special maintainership related to the button driver.
All drivers of ACPI spec defined devices (PNPxxx stuffs) are maintained as a whole by Rafael.
And we are helping him by doing triage on kernel Bugzilla.

There isn't a category on kernel Bugzilla related to lid issues.
Probably you can help to create one under ACPI product.
If this can be achieved, you can be the default assignee for such issues.
If this cannot be achieved, we'll just forward some lid reports to you first.
Please tell me which product/component you'd prefer for us to forward.

For modifying acpi/button.c, it's open source, you can send patches and ACPI community can help to review.

Thanks and best regards
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15 13:05                   ` Benjamin Tissoires
@ 2017-05-16  5:33                     ` Zheng, Lv
  2017-05-16  5:47                         ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-16  5:33 UTC (permalink / raw)
  To: Benjamin Tissoires, Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi, Guys

> From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> 
> On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
> > <benjamin.tissoires@redhat.com> wrote:
> > > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > >> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> > >> <benjamin.tissoires@redhat.com> wrote:
> > >> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> > >> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> > >> >> > Hi,
> > >> >> >
> > >> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > >> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> lid_init_state=open"
> > >> >> > >
> > >> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > >> >> > > > Hi,
> > >> >> > > >
> > >> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > >> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> lid_init_state=open"
> > >> >> > > > >
> > >> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > >> >> > > > >
> > >> >> > > > > Even if the method implementation can be buggy on some platform,
> > >> >> > > > > the "open" choice is worse. It breaks docking stations basically
> > >> >> > > > > and there is no way to have a user-space hwdb to fix that.
> > >> >> > > > >
> > >> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> > >> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > >> >> > > > > the state of the LID switch for us: you need to set the udev
> > >> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > >> >> > > > >
> > >> >> > > > > When libinput detects internal keyboard events, it will
> > >> >> > > > > overwrite the state of the switch to open, making it reliable
> > >> >> > > > > again. Given that logind only checks the LID switch value after
> > >> >> > > > > a timeout, we can assume the user will use the internal keyboard
> > >> >> > > > > before this timeout expires.
> > >> >> > > > >
> > >> >> > > > > For example, such a hwdb entry is:
> > >> >> > > > >
> > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > >> >> > > >
> > >> >
> > >> > [...]
> > >> >
> > >> >>
> > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > >> >> question and now it works differently, then it does break things.
> > >> >>
> > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > >> >
> > >> > That is correct. This patch I reverted introduces regression for professional
> > >> > laptops that expect the LID switch to be reported accurately.
> > >>
> > >> And from a user's perspective, what does not work any more?
> > >
> > > If you boot or resume your laptop with the lid closed on a docking
> > > station while using an external monitor connected to it, both internal
> > > and external displays will light on, while only the external should.
> > >
> > > There is a design choice in gdm to only provide the greater on the
> > > internal display when lit on, so users only see a gray area on the
> > > external monitor. Also, the cursor will not show up as it's by default
> > > on the internal display too.
> > >
> > > To "fix" that, users have to open the laptop once and close it once
> > > again to sync the state of the switch with the hardware state.
> >
> > OK
> >
> > Yeah, that sucks.
> >
> > So without the Lv's patch the behavior (on the systems in question) is
> > as expected, right?
> >
> 
> Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.

I would make an argument that:
A. Is this necessarily a button driver regression?
1. Users already configured to not using internal display, why gdm need to determine it again instead of users choice?
2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
   If users didn't change state during suspend, then everything should be correct.
   If users changed state during suspend, it should be acceptable for users to change it again to make the state correct.
See, this is obviously a case that is not so strictly related to ACPI button driver.
Why do we need to force button driver to marry external monitors.
B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
Why should we change our default behavior aimlessly?

Thanks and best regards
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-16  5:33                     ` Zheng, Lv
@ 2017-05-16  5:47                         ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-16  5:47 UTC (permalink / raw)
  To: Zheng, Lv, Benjamin Tissoires, Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi,

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> Lv
> Subject: RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> 
> Hi, Guys
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >
> > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
> > > <benjamin.tissoires@redhat.com> wrote:
> > > > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > >> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> > > >> <benjamin.tissoires@redhat.com> wrote:
> > > >> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > >> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> > > >> >> > Hi,
> > > >> >> >
> > > >> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > >> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > lid_init_state=open"
> > > >> >> > >
> > > >> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > >> >> > > > Hi,
> > > >> >> > > >
> > > >> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > >> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > lid_init_state=open"
> > > >> >> > > > >
> > > >> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > > >> >> > > > >
> > > >> >> > > > > Even if the method implementation can be buggy on some platform,
> > > >> >> > > > > the "open" choice is worse. It breaks docking stations basically
> > > >> >> > > > > and there is no way to have a user-space hwdb to fix that.
> > > >> >> > > > >
> > > >> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> > > >> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > >> >> > > > > the state of the LID switch for us: you need to set the udev
> > > >> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > > >> >> > > > >
> > > >> >> > > > > When libinput detects internal keyboard events, it will
> > > >> >> > > > > overwrite the state of the switch to open, making it reliable
> > > >> >> > > > > again. Given that logind only checks the LID switch value after
> > > >> >> > > > > a timeout, we can assume the user will use the internal keyboard
> > > >> >> > > > > before this timeout expires.
> > > >> >> > > > >
> > > >> >> > > > > For example, such a hwdb entry is:
> > > >> >> > > > >
> > > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > >> >> > > >
> > > >> >
> > > >> > [...]
> > > >> >
> > > >> >>
> > > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > > >> >> question and now it works differently, then it does break things.
> > > >> >>
> > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > >> >
> > > >> > That is correct. This patch I reverted introduces regression for professional
> > > >> > laptops that expect the LID switch to be reported accurately.
> > > >>
> > > >> And from a user's perspective, what does not work any more?
> > > >
> > > > If you boot or resume your laptop with the lid closed on a docking
> > > > station while using an external monitor connected to it, both internal
> > > > and external displays will light on, while only the external should.
> > > >
> > > > There is a design choice in gdm to only provide the greater on the
> > > > internal display when lit on, so users only see a gray area on the
> > > > external monitor. Also, the cursor will not show up as it's by default
> > > > on the internal display too.
> > > >
> > > > To "fix" that, users have to open the laptop once and close it once
> > > > again to sync the state of the switch with the hardware state.
> > >
> > > OK
> > >
> > > Yeah, that sucks.
> > >
> > > So without the Lv's patch the behavior (on the systems in question) is
> > > as expected, right?
> > >
> >
> > Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.
> 
> I would make an argument that:
> A. Is this necessarily a button driver regression?
> 1. Users already configured to not using internal display, why gdm need to determine it again instead
> of users choice?
> 2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
>    If users didn't change state during suspend, then everything should be correct.
>    If users changed state during suspend, it should be acceptable for users to change it again to make
> the state correct.
> See, this is obviously a case that is not so strictly related to ACPI button driver.
> Why do we need to force button driver to marry external monitors.
> B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
> Why should we change our default behavior aimlessly?

I have one more concern:
In button.lid_init_state=method mode,
Is that possible for libinput to work things around if _LID return value is not correct?
How libinput ensures correct timing of overwriting the input node value?
Will button driver faked event value overwrites what libinput has written?

From this point of view, button.lid_init_state=ignore might be a better choice than button.lid_init_state=method to allow libinput to deal with all kind of cases.

Thanks and best regards
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
@ 2017-05-16  5:47                         ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-16  5:47 UTC (permalink / raw)
  To: Zheng, Lv, Benjamin Tissoires, Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi,

> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> Lv
> Subject: RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> 
> Hi, Guys
> 
> > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> >
> > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
> > > <benjamin.tissoires@redhat.com> wrote:
> > > > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > >> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> > > >> <benjamin.tissoires@redhat.com> wrote:
> > > >> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > >> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> > > >> >> > Hi,
> > > >> >> >
> > > >> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > >> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > lid_init_state=open"
> > > >> >> > >
> > > >> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > >> >> > > > Hi,
> > > >> >> > > >
> > > >> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > >> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > lid_init_state=open"
> > > >> >> > > > >
> > > >> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > > >> >> > > > >
> > > >> >> > > > > Even if the method implementation can be buggy on some platform,
> > > >> >> > > > > the "open" choice is worse. It breaks docking stations basically
> > > >> >> > > > > and there is no way to have a user-space hwdb to fix that.
> > > >> >> > > > >
> > > >> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> > > >> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > >> >> > > > > the state of the LID switch for us: you need to set the udev
> > > >> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > > >> >> > > > >
> > > >> >> > > > > When libinput detects internal keyboard events, it will
> > > >> >> > > > > overwrite the state of the switch to open, making it reliable
> > > >> >> > > > > again. Given that logind only checks the LID switch value after
> > > >> >> > > > > a timeout, we can assume the user will use the internal keyboard
> > > >> >> > > > > before this timeout expires.
> > > >> >> > > > >
> > > >> >> > > > > For example, such a hwdb entry is:
> > > >> >> > > > >
> > > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > >> >> > > >
> > > >> >
> > > >> > [...]
> > > >> >
> > > >> >>
> > > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > > >> >> question and now it works differently, then it does break things.
> > > >> >>
> > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > >> >
> > > >> > That is correct. This patch I reverted introduces regression for professional
> > > >> > laptops that expect the LID switch to be reported accurately.
> > > >>
> > > >> And from a user's perspective, what does not work any more?
> > > >
> > > > If you boot or resume your laptop with the lid closed on a docking
> > > > station while using an external monitor connected to it, both internal
> > > > and external displays will light on, while only the external should.
> > > >
> > > > There is a design choice in gdm to only provide the greater on the
> > > > internal display when lit on, so users only see a gray area on the
> > > > external monitor. Also, the cursor will not show up as it's by default
> > > > on the internal display too.
> > > >
> > > > To "fix" that, users have to open the laptop once and close it once
> > > > again to sync the state of the switch with the hardware state.
> > >
> > > OK
> > >
> > > Yeah, that sucks.
> > >
> > > So without the Lv's patch the behavior (on the systems in question) is
> > > as expected, right?
> > >
> >
> > Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.
> 
> I would make an argument that:
> A. Is this necessarily a button driver regression?
> 1. Users already configured to not using internal display, why gdm need to determine it again instead
> of users choice?
> 2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
>    If users didn't change state during suspend, then everything should be correct.
>    If users changed state during suspend, it should be acceptable for users to change it again to make
> the state correct.
> See, this is obviously a case that is not so strictly related to ACPI button driver.
> Why do we need to force button driver to marry external monitors.
> B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
> Why should we change our default behavior aimlessly?

I have one more concern:
In button.lid_init_state=method mode,
Is that possible for libinput to work things around if _LID return value is not correct?
How libinput ensures correct timing of overwriting the input node value?
Will button driver faked event value overwrites what libinput has written?

>From this point of view, button.lid_init_state=ignore might be a better choice than button.lid_init_state=method to allow libinput to deal with all kind of cases.

Thanks and best regards
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-16  5:47                         ` Zheng, Lv
  (?)
@ 2017-05-16  7:15                         ` Benjamin Tissoires
  2017-05-16  8:30                           ` Zheng, Lv
  -1 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-16  7:15 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann, linux-acpi,
	linux-kernel

On May 16 2017 or thereabouts, Zheng, Lv wrote:
> Hi,
> 
> > From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of Zheng,
> > Lv
> > Subject: RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > 
> > Hi, Guys
> > 
> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
> > >
> > > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > > On Mon, May 15, 2017 at 11:37 AM, Benjamin Tissoires
> > > > <benjamin.tissoires@redhat.com> wrote:
> > > > > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > > >> On Mon, May 15, 2017 at 9:45 AM, Benjamin Tissoires
> > > > >> <benjamin.tissoires@redhat.com> wrote:
> > > > >> > On May 12 2017 or thereabouts, Rafael J. Wysocki wrote:
> > > > >> >> On Friday, May 12, 2017 02:36:20 AM Zheng, Lv wrote:
> > > > >> >> > Hi,
> > > > >> >> >
> > > > >> >> > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > >> >> > > Subject: Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > > lid_init_state=open"
> > > > >> >> > >
> > > > >> >> > > On May 11 2017 or thereabouts, Zheng, Lv wrote:
> > > > >> >> > > > Hi,
> > > > >> >> > > >
> > > > >> >> > > > > From: Benjamin Tissoires [mailto:benjamin.tissoires@redhat.com]
> > > > >> >> > > > > Subject: [PATCH 2/2] Revert "ACPI / button: Change default behavior to
> > > lid_init_state=open"
> > > > >> >> > > > >
> > > > >> >> > > > > This reverts commit 77e9a4aa9de10cc1418bf9a892366988802a8025.
> > > > >> >> > > > >
> > > > >> >> > > > > Even if the method implementation can be buggy on some platform,
> > > > >> >> > > > > the "open" choice is worse. It breaks docking stations basically
> > > > >> >> > > > > and there is no way to have a user-space hwdb to fix that.
> > > > >> >> > > > >
> > > > >> >> > > > > On the contrary, it's rather easy in user-space to have a hwdb
> > > > >> >> > > > > with the problematic platforms. Then, libinput (1.7.0+) can fix
> > > > >> >> > > > > the state of the LID switch for us: you need to set the udev
> > > > >> >> > > > > property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'.
> > > > >> >> > > > >
> > > > >> >> > > > > When libinput detects internal keyboard events, it will
> > > > >> >> > > > > overwrite the state of the switch to open, making it reliable
> > > > >> >> > > > > again. Given that logind only checks the LID switch value after
> > > > >> >> > > > > a timeout, we can assume the user will use the internal keyboard
> > > > >> >> > > > > before this timeout expires.
> > > > >> >> > > > >
> > > > >> >> > > > > For example, such a hwdb entry is:
> > > > >> >> > > > >
> > > > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > > >> >> > > >
> > > > >> >
> > > > >> > [...]
> > > > >> >
> > > > >> >>
> > > > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > > > >> >> question and now it works differently, then it does break things.
> > > > >> >>
> > > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > > >> >
> > > > >> > That is correct. This patch I reverted introduces regression for professional
> > > > >> > laptops that expect the LID switch to be reported accurately.
> > > > >>
> > > > >> And from a user's perspective, what does not work any more?
> > > > >
> > > > > If you boot or resume your laptop with the lid closed on a docking
> > > > > station while using an external monitor connected to it, both internal
> > > > > and external displays will light on, while only the external should.
> > > > >
> > > > > There is a design choice in gdm to only provide the greater on the
> > > > > internal display when lit on, so users only see a gray area on the
> > > > > external monitor. Also, the cursor will not show up as it's by default
> > > > > on the internal display too.
> > > > >
> > > > > To "fix" that, users have to open the laptop once and close it once
> > > > > again to sync the state of the switch with the hardware state.
> > > >
> > > > OK
> > > >
> > > > Yeah, that sucks.
> > > >
> > > > So without the Lv's patch the behavior (on the systems in question) is
> > > > as expected, right?
> > > >
> > >
> > > Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.
> > 
> > I would make an argument that:
> > A. Is this necessarily a button driver regression?
> > 1. Users already configured to not using internal display, why gdm need to determine it again instead
> > of users choice?
> > 2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
> >    If users didn't change state during suspend, then everything should be correct.
> >    If users changed state during suspend, it should be acceptable for users to change it again to make
> > the state correct.
> > See, this is obviously a case that is not so strictly related to ACPI button driver.
> > Why do we need to force button driver to marry external monitors.
> > B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
> > Why should we change our default behavior aimlessly?
> 
> I have one more concern:
> In button.lid_init_state=method mode,
> Is that possible for libinput to work things around if _LID return value is not correct?
> How libinput ensures correct timing of overwriting the input node value?
> Will button driver faked event value overwrites what libinput has written?
> 
> From this point of view, button.lid_init_state=ignore might be a better choice than button.lid_init_state=method to allow libinput to deal with all kind of cases.
> 

This is my last email on this topic, I don't even want to fully read/answer
the one in 1/2 given the amount of bad faith you put in that.

This is a REGRESSION. It used to work on thousands of devices, it
doesn't anymore. So any regression has to be chased down and no good
reason can justify such a regression.

The only solution is to revert both these changes. We can not ask user
space to fix a kernel regression, it's not how it works.

You can not also change the semantic of an input switch. An input
switch, as per the input subsystem is supposed to forward an actual
state of the underlying hardware. Any fake information is bad and has to
be avoided.

I already gave you 2 solutions to fix the 7 machines you see that are
problematic, and you just seem to ignore them:
- revert to the v4.10 behavior and let libinput fix that for you
- revert to the v4.10 behavior and have a quirk database in acpi/button

I also proposed to take maintainership on this particular module because
you said you were assigned this by default because you were the last
introducing changes in it. I asked you twice, and two times you replied
skipping this part.

It's clear you don't want to revert to the old state, and even if I can
prove to you that you have to, you don't listen.

So please, do not force me to call the maintainers and Linus on this
simple 2 reverts.

Cheers,
Benjamin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-16  7:15                         ` Benjamin Tissoires
@ 2017-05-16  8:30                           ` Zheng, Lv
  2017-05-16 10:10                             ` Benjamin Tissoires
  0 siblings, 1 reply; 38+ messages in thread
From: Zheng, Lv @ 2017-05-16  8:30 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann, linux-acpi,
	linux-kernel

Hi, Benjamin

> > > > > >> >> > > > > For example, such a hwdb entry is:
> > > > > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > > > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > > > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > > > > >> >> question and now it works differently, then it does break things.
> > > > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > > > >> > That is correct. This patch I reverted introduces regression for professional
> > > > > >> > laptops that expect the LID switch to be reported accurately.
> > > > > >> And from a user's perspective, what does not work any more?
> > > > > > If you boot or resume your laptop with the lid closed on a docking
> > > > > > station while using an external monitor connected to it, both internal
> > > > > > and external displays will light on, while only the external should.
> > > > > > There is a design choice in gdm to only provide the greater on the
> > > > > > internal display when lit on, so users only see a gray area on the
> > > > > > external monitor. Also, the cursor will not show up as it's by default
> > > > > > on the internal display too.
> > > > > > To "fix" that, users have to open the laptop once and close it once
> > > > > > again to sync the state of the switch with the hardware state.
> > > > > OK
> > > > > Yeah, that sucks.
> > > > > So without the Lv's patch the behavior (on the systems in question) is
> > > > > as expected, right?
> > > > Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.
> > > I would make an argument that:
> > > A. Is this necessarily a button driver regression?
> > > 1. Users already configured to not using internal display, why gdm need to determine it again
> instead
> > > of users choice?
> > > 2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
> > >    If users didn't change state during suspend, then everything should be correct.
> > >    If users changed state during suspend, it should be acceptable for users to change it again to
> make
> > > the state correct.
> > > See, this is obviously a case that is not so strictly related to ACPI button driver.
> > > Why do we need to force button driver to marry external monitors.
> > > B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
> > > Why should we change our default behavior aimlessly?
> >
> > I have one more concern:
> > In button.lid_init_state=method mode,
> > Is that possible for libinput to work things around if _LID return value is not correct?
> > How libinput ensures correct timing of overwriting the input node value?
> > Will button driver faked event value overwrites what libinput has written?
> >
> > From this point of view, button.lid_init_state=ignore might be a better choice than
> button.lid_init_state=method to allow libinput to deal with all kind of cases.
> >
> 
> This is my last email on this topic, I don't even want to fully read/answer
> the one in 1/2 given the amount of bad faith you put in that.

What's that?
I mean, the bad faith?

> This is a REGRESSION. It used to work on thousands of devices, it
> doesn't anymore. So any regression has to be chased down and no good
> reason can justify such a regression.

I triggered many such kind of layered regressions and did fix them 1 by 1 in different places.
However, this might be different.
Which depends on our agreement.

> The only solution is to revert both these changes. We can not ask user
> space to fix a kernel regression, it's not how it works.

Yes, I know.
We just asked users to use quirk modes of button driver.
And there is in fact always one of them working.
We haven't asked user space to change.
We are just discussing the correctness of some user space behaviors.

> You can not also change the semantic of an input switch. An input
> switch, as per the input subsystem is supposed to forward an actual
> state of the underlying hardware. Any fake information is bad and has to
> be avoided.

Since fake events are harmful, why do we fake an event after boot/resume?
button.lid_init_state=method seems can fake such an event.

> I already gave you 2 solutions to fix the 7 machines you see that are
> problematic, and you just seem to ignore them:
> - revert to the v4.10 behavior and let libinput fix that for you

I already chose this.
But I just raised a concern that button.lid_init_state=method could bring troubles to libinput quirks.

> - revert to the v4.10 behavior and have a quirk database in acpi/button
> 
> I also proposed to take maintainership on this particular module because
> you said you were assigned this by default because you were the last
> introducing changes in it. I asked you twice, and two times you replied
> skipping this part.

I didn't, you just skipped PATCH 1/2 reply...
Let me copy it from that email:
=====
Actually there is no special maintainership related to the button driver.
All drivers of ACPI spec defined devices (PNPxxx stuffs) are maintained as a whole by Rafael.
And we are helping him by doing triage on kernel Bugzilla.

There isn't a category on kernel Bugzilla related to lid issues.
Probably you can help to create one under ACPI product.
If this can be achieved, you can be the default assignee for such issues.
If this cannot be achieved, we'll just forward some lid reports to you first.
Please tell me which product/component you'd prefer for us to forward.

For modifying acpi/button.c, it's open source, you can send patches and ACPI community can help to review.
=====

> It's clear you don't want to revert to the old state, and even if I can
> prove to you that you have to, you don't listen.
> So please, do not force me to call the maintainers and Linus on this
> simple 2 reverts.

No, I currently cannot persuade myself to revert to "method" mode.
But that doesn't mean I don't want to or I want to.
You just didn't prove that by answering my questions in PATCH 1/2 and in this email.
Especially:
1. After reverting to "method" mode, can libinput work all cases around?
   Are there any timing issues that can prevent libinput from working some sucks around.
   Considering this case, it's likely such problems can happen.
   https://bugzilla.kernel.org/show_bug.cgi?id=106151
2. Who should be responsible for fixing this issue after reverting to "method" mode:
   https://bugzilla.kernel.org/show_bug.cgi?id=106151
   What's the root cause of this issue?
3. How can we generate the quirk for the following possible breakage:
   No lid notification and _LID return cached open after boot/resume

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-16  8:30                           ` Zheng, Lv
@ 2017-05-16 10:10                             ` Benjamin Tissoires
  2017-05-17  7:32                                 ` Zheng, Lv
  0 siblings, 1 reply; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-16 10:10 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann, linux-acpi,
	linux-kernel

On May 16 2017 or thereabouts, Zheng, Lv wrote:
> Hi, Benjamin
> 
> > > > > > >> >> > > > > For example, such a hwdb entry is:
> > > > > > >> >> > > > > libinput:name:*Lid Switch*:dmi:*svnMicrosoftCorporation:pnSurface3:*
> > > > > > >> >> > > > >  LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open
> > > > > > >> >> Well, if it worked in a specific way that users depended on before the commit in
> > > > > > >> >> question and now it works differently, then it does break things.
> > > > > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > > > > >> > That is correct. This patch I reverted introduces regression for professional
> > > > > > >> > laptops that expect the LID switch to be reported accurately.
> > > > > > >> And from a user's perspective, what does not work any more?
> > > > > > > If you boot or resume your laptop with the lid closed on a docking
> > > > > > > station while using an external monitor connected to it, both internal
> > > > > > > and external displays will light on, while only the external should.
> > > > > > > There is a design choice in gdm to only provide the greater on the
> > > > > > > internal display when lit on, so users only see a gray area on the
> > > > > > > external monitor. Also, the cursor will not show up as it's by default
> > > > > > > on the internal display too.
> > > > > > > To "fix" that, users have to open the laptop once and close it once
> > > > > > > again to sync the state of the switch with the hardware state.
> > > > > > OK
> > > > > > Yeah, that sucks.
> > > > > > So without the Lv's patch the behavior (on the systems in question) is
> > > > > > as expected, right?
> > > > > Yes, reverting these 2 patches restores the pre v4.11 kernel behavior.
> > > > I would make an argument that:
> > > > A. Is this necessarily a button driver regression?
> > > > 1. Users already configured to not using internal display, why gdm need to determine it again
> > instead
> > > > of users choice?
> > > > 2. Can gdm/graphics driver saves state before suspend, and restores saved state after resume?
> > > >    If users didn't change state during suspend, then everything should be correct.
> > > >    If users changed state during suspend, it should be acceptable for users to change it again to
> > make
> > > > the state correct.
> > > > See, this is obviously a case that is not so strictly related to ACPI button driver.
> > > > Why do we need to force button driver to marry external monitors.
> > > > B. Bug reporters are all ok with using quirk modes as boot parameters to work this around.
> > > > Why should we change our default behavior aimlessly?
> > >
> > > I have one more concern:
> > > In button.lid_init_state=method mode,
> > > Is that possible for libinput to work things around if _LID return value is not correct?
> > > How libinput ensures correct timing of overwriting the input node value?
> > > Will button driver faked event value overwrites what libinput has written?
> > >
> > > From this point of view, button.lid_init_state=ignore might be a better choice than
> > button.lid_init_state=method to allow libinput to deal with all kind of cases.
> > >
> > 
> > This is my last email on this topic, I don't even want to fully read/answer
> > the one in 1/2 given the amount of bad faith you put in that.
> 
> What's that?
> I mean, the bad faith?

I already explained 4 times why we need to revert these two patches and
why we need to keep 'method'. And you keep answering with long emails
that you would rather not. I call it bad faith, sorry.

> 
> > This is a REGRESSION. It used to work on thousands of devices, it
> > doesn't anymore. So any regression has to be chased down and no good
> > reason can justify such a regression.
> 
> I triggered many such kind of layered regressions and did fix them 1 by 1 in different places.
> However, this might be different.

No. It is a regression. It used to work for thousands of devices befor
v4.11, and now it's broken for those devices. It's a regression.

Some new devices are broken with "method", it's a bug, and we can't fix
them by regressing on all the others.

> Which depends on our agreement.
> 
> > The only solution is to revert both these changes. We can not ask user
> > space to fix a kernel regression, it's not how it works.
> 
> Yes, I know.
> We just asked users to use quirk modes of button driver.

I call this "fixing by users", and this is wrong. It used to work for
years for almost everybody, you can not ask users to fix this one by
one.

> And there is in fact always one of them working.

Yes, it's called a quirk. And the good practice is to register those
quirks and make them available to everybody. Being in hwdb in user space
or in acpi/button in kernel space doesn't matter, we need them.

> We haven't asked user space to change.
> We are just discussing the correctness of some user space behaviors.

They *are* correct. They are following the exported ACPI documentation
and the input node documentation. Quoting the input doc:
file Documentation/input/event-codes.rst:
EV_SW
-----

EV_SW events describe stateful binary switches. For example, the SW_LID code is
used to denote when a laptop lid is closed.

Upon binding to a device or resuming from suspend, a driver must report
the current switch state. This ensures that the device, kernel, and userspace
state is in sync.

Upon resume, if the switch state is the same as before suspend, then the input
subsystem will filter out the duplicate switch state reports. The driver does
not need to keep the state of the switch at any time.
----

So no, you can't have 'ignore' or 'open' to be the default, because user
space expects the switch to reflect the current state of the hardware. 

> 
> > You can not also change the semantic of an input switch. An input
> > switch, as per the input subsystem is supposed to forward an actual
> > state of the underlying hardware. Any fake information is bad and has to
> > be avoided.
> 
> Since fake events are harmful, why do we fake an event after boot/resume?
> button.lid_init_state=method seems can fake such an event.

We don't fake an event, we are syncing the input switch state with the
hardware.

Faking an event is when you send "switch is open" while you know there
is a possibility it's actually closed.

> 
> > I already gave you 2 solutions to fix the 7 machines you see that are
> > problematic, and you just seem to ignore them:
> > - revert to the v4.10 behavior and let libinput fix that for you
> 
> I already chose this.
> But I just raised a concern that button.lid_init_state=method could bring troubles to libinput quirks.

No. We can do whatever we want in libinput. And we can fix hardware when
it appears. We don't need to have the correct solution for everybody,
ever. libinput is a library and it can be updated, and we can ask users
to upgrade. We can even break the API by bumping the soname. We have
much more liberty that the kernel has, so the libinput implementation
shouldn't be a concern.

> 
> > - revert to the v4.10 behavior and have a quirk database in acpi/button
> > 
> > I also proposed to take maintainership on this particular module because
> > you said you were assigned this by default because you were the last
> > introducing changes in it. I asked you twice, and two times you replied
> > skipping this part.
> 
> I didn't, you just skipped PATCH 1/2 reply...
> Let me copy it from that email:
> =====
> Actually there is no special maintainership related to the button driver.
> All drivers of ACPI spec defined devices (PNPxxx stuffs) are maintained as a whole by Rafael.

I know, but there is the file MAINTAINER where you can add individual
maintainer per file. Rafael will still be the ACPI maintainer, providing
PR to Linus, but I'll just be the other goto-guy when people run
get_maintainer.pl on the acpi/button file.

> And we are helping him by doing triage on kernel Bugzilla.
> 
> There isn't a category on kernel Bugzilla related to lid issues.
> Probably you can help to create one under ACPI product.

That would be good to have such a category (maybe not LID, but input or
button to reflect the file name)

> If this can be achieved, you can be the default assignee for such issues.
> If this cannot be achieved, we'll just forward some lid reports to you first.

Please do.

> Please tell me which product/component you'd prefer for us to forward.

I'd be happy to be redirected anything input related.

> 
> For modifying acpi/button.c, it's open source, you can send patches and ACPI community can help to review.

Which I already did. My concern is more when there is a change I am not
aware of which introduces regressions and which I could have provided a
valid NACK early enough.

> =====
> 
> > It's clear you don't want to revert to the old state, and even if I can
> > prove to you that you have to, you don't listen.
> > So please, do not force me to call the maintainers and Linus on this
> > simple 2 reverts.
> 
> No, I currently cannot persuade myself to revert to "method" mode.
> But that doesn't mean I don't want to or I want to.
> You just didn't prove that by answering my questions in PATCH 1/2 and in this email.
> Especially:
> 1. After reverting to "method" mode, can libinput work all cases around?

If it doesn't, we'll fix it. So yes, it will eventually.

>    Are there any timing issues that can prevent libinput from working some sucks around.
>    Considering this case, it's likely such problems can happen.
>    https://bugzilla.kernel.org/show_bug.cgi?id=106151

logind has a delay between the time it starts and the time it starts
polling the lid switch state. The default value is a little bit too
short on Fedora considering the faulty devices are running small CPUs.
But this can be changed as wish by the user and by systemd. They told us
they took one arbitrary value by default, and we can change it.

We can ask systemd/logind to change part of the behavior for new devices
when there is a bug. But we can't ask them to change the whole design
because there is a regression in the kernel.

> 2. Who should be responsible for fixing this issue after reverting to "method" mode:
>    https://bugzilla.kernel.org/show_bug.cgi?id=106151

libinput will change the value to open given the heuristics we have.

>    What's the root cause of this issue?

Poorly written EC?
It doesn't matter. We know the machine is buggy, we'll just need a
workaround.

> 3. How can we generate the quirk for the following possible breakage:
>    No lid notification and _LID return cached open after boot/resume

"method" will fetch and report the cached _LID value as open, so there
is no breakage. Unless the lid is actually closed and the EC is plain
wrong, but that is something we can also explain to users or fix by an
other mean. But nothing generic will work. For instance, on the Surface
3, the LID notification for open isn't forwarded, so I wrote a specific
driver to get the correct behavior based on a careful analysis of the
DSDT.

Cheers,
Benjamin

> 
> Cheers,
> Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-16 10:10                             ` Benjamin Tissoires
@ 2017-05-17  7:32                                 ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-17  7:32 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann, linux-acpi,
	linux-kernel

Hi, Benjamin

> > What's that?
> > I mean, the bad faith?
> I already explained 4 times why we need to revert these two patches and
> why we need to keep 'method'. And you keep answering with long emails
> that you would rather not. I call it bad faith, sorry.

The 4 times explanations didn't answer my questions.
But that's OK, let's clarify it again.

> > > This is a REGRESSION. It used to work on thousands of devices, it
> > > doesn't anymore. So any regression has to be chased down and no good
> > > reason can justify such a regression.
> > I triggered many such kind of layered regressions and did fix them 1 by 1 in different places.
> > However, this might be different.
> No. It is a regression. It used to work for thousands of devices befor
> v4.11, and now it's broken for those devices. It's a regression.
> Some new devices are broken with "method", it's a bug, and we can't fix
> them by regressing on all the others.
...
> I call this "fixing by users", and this is wrong. It used to work for
> years for almost everybody, you can not ask users to fix this one by
> one.

What about regressions triggered by this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23de5d9ef2a4
Before that (year 2007), "ignore" is the default mode.

Other than this, I just had concerns related to fixing things back and forth, but you didn't reply properly.
Again, that's OK, let's just clarify it.

> Yes, it's called a quirk. And the good practice is to register those
> quirks and make them available to everybody. Being in hwdb in user space
> or in acpi/button in kernel space doesn't matter, we need them.

I have no objections but concerns related to the combination of "default mode" and "quirk responsibles".
From my point of view, these are my conclusions:
1. If you want to use libinput to generate quirks, you should use "ignore" rather than "method" mode as default mode;
2. If you want to use button driver to generate quirks, we need "close" mode;
3. If GDM can change or users are ok use command lines, we can remain to use "open" as the default behavior.
(I'll send technical details in private about these conclusions)
But you seem to always:
1. Say no to "ignore" which makes 1 impossible;
2. Say no to "close" which makes 2 impossible;
3. Say no to "open" which makes 3 impossible.

> > We haven't asked user space to change.
> > We are just discussing the correctness of some user space behaviors.
> 
> They *are* correct.
> They are following the exported ACPI documentation

I doubt. In ACPI world, Windows is the only standard.

> and the input node documentation.
> Quoting the input doc:
> file Documentation/input/event-codes.rst:
> EV_SW
> -----
> 
> EV_SW events describe stateful binary switches. For example, the SW_LID code is
> used to denote when a laptop lid is closed.
> 
> Upon binding to a device or resuming from suspend, a driver must report
> the current switch state. This ensures that the device, kernel, and userspace
> state is in sync.
> 
> Upon resume, if the switch state is the same as before suspend, then the input
> subsystem will filter out the duplicate switch state reports. The driver does
> not need to keep the state of the switch at any time.
> ----

That's really a convenient feature for driver.
If I'm the driver writers, I would be very appreciated for being able to use such features.
So you see I don't have objections to having this feature.

I just have concerns related to:
1. Is it required to have a timeout in systemd, forcing platform to suspend again, just due to event delays?
2. Is it required to use SW_LID to determine whether an internal display should be lit on?
I don't see any conflicts between the ABI of EV_SW and the 2 questions.

> So no, you can't have 'ignore' or 'open' to be the default, because user
> space expects the switch to reflect the current state of the hardware.

Then what's the benefit of having 'method' to be the default,
Given it is still not able to reliably deliver the current state of hardware?
Actually both 'ignore/open/method' modes are trying to be compliant to EV_SW.
Among them, "ignore" did the best IMO.
And cases broken in "ignore" mode but not broken in "method" mode are all issues:
 - Platform doesn't send notification after boot/resume.
   IMO, we should also collect them and indicate them to desktop managers.

So in the end, we just have differences related to picking which default mode.

> > > You can not also change the semantic of an input switch. An input
> > > switch, as per the input subsystem is supposed to forward an actual
> > > state of the underlying hardware. Any fake information is bad and has to
> > > be avoided.
> > Since fake events are harmful, why do we fake an event after boot/resume?
> > button.lid_init_state=method seems can fake such an event.
> We don't fake an event, we are syncing the input switch state with the
> hardware.
> Faking an event is when you send "switch is open" while you know there
> is a possibility it's actually closed.

No we are faking events in this mode.
_LID can return cached state, and:
1. it might be "close" while LID is "open" after suspend.
2. it might be "open" while LID is "close" after suspend.
In this case, it explains the side effect of having type 1 fake event:
https://bugzilla.kernel.org/show_bug.cgi?id=106151
If we don't evaluate _LID and send such a "close", the platform can correctly send "open" just with a delay.

> > > I already gave you 2 solutions to fix the 7 machines you see that are
> > > problematic, and you just seem to ignore them:
> > > - revert to the v4.10 behavior and let libinput fix that for you
> >
> > I already chose this.
> > But I just raised a concern that button.lid_init_state=method could bring troubles to libinput
> quirks.
> 
> No. We can do whatever we want in libinput. And we can fix hardware when
> it appears. We don't need to have the correct solution for everybody,
> ever. libinput is a library and it can be updated, and we can ask users
> to upgrade. We can even break the API by bumping the soname. We have
> much more liberty that the kernel has, so the libinput implementation
> shouldn't be a concern.

It seems you don't know all the details I was looking at.
It's about a timing problem, with button.lid_init_state=method and libinput quirk, we actually have 2 quirks.
And libinput write can appear before/after the faked init notification triggered by "method" mode.
Causing some cases not workaroundable.
See below for details.

> > No, I currently cannot persuade myself to revert to "method" mode.
> > But that doesn't mean I don't want to or I want to.
> > You just didn't prove that by answering my questions in PATCH 1/2 and in this email.
> > Especially:
> > 1. After reverting to "method" mode, can libinput work all cases around?
> If it doesn't, we'll fix it. So yes, it will eventually.

That's great.

> >    Are there any timing issues that can prevent libinput from working some sucks around.
> >    Considering this case, it's likely such problems can happen.
> >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> 
> logind has a delay between the time it starts and the time it starts
> polling the lid switch state. The default value is a little bit too
> short on Fedora considering the faulty devices are running small CPUs.
> But this can be changed as wish by the user and by systemd. They told us
> they took one arbitrary value by default, and we can change it.
> 
> We can ask systemd/logind to change part of the behavior for new devices
> when there is a bug. But we can't ask them to change the whole design
> because there is a regression in the kernel.

That sounds great!
If systemd/logind can be changed, can we ask systemd/logind to change it to be longer enough.
For example, allowing users to configure this to "INFINITE"?
That solves many other problems.

NOTE, EV_SW is a good feature to reduce duplicated events,
but that doesn't mean users of EV_SW need to be so strict to the timings of SW_LID, right?

If it can be configured to "INFINITE", with "ignore" mode,
systemd/logind should always be able to see an "open" event before seeing a new BIOS triggered "close" events.

So why do you refuse the possibilities of using "open/ignore" as default modes?

> > 2. Who should be responsible for fixing this issue after reverting to "method" mode:
> >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> 
> libinput will change the value to open given the heuristics we have.

Yes, I see.
But I also can see one broken case cannot be worked around by libinput if you insist to use "method".

> >    What's the root cause of this issue?
> 
> Poorly written EC?
> It doesn't matter. We know the machine is buggy, we'll just need a
> workaround.

It seems you know better than the SAMSUNG official supporters?
The 10-20 seconds lag can be seen even on Windows:
http://www.samsung.com/uk/support/model/NP-N210-JP02UK

Since you seem to be able to see something I'm not aware of.
Let me ask, which do you mean by using word "poorly", EC driver or EC firmware?
If it's EC driver, can you fix it?

> > 3. How can we generate the quirk for the following possible breakage:
> >    No lid notification and _LID return cached open after boot/resume
> 
> "method" will fetch and report the cached _LID value as open, so there
> is no breakage. Unless the lid is actually closed and the EC is plain
> wrong, but that is something we can also explain to users or fix by an
> other mean. But nothing generic will work. For instance, on the Surface
> 3, the LID notification for open isn't forwarded, so I wrote a specific
> driver to get the correct behavior based on a careful analysis of the
> DSDT.

First, I couldn't see anything related to EC here but you kept on talking EC.
Do you have personal feelings against EC?

In fact, this is just a theoretical issue showing that something cannot be worked around by libinput when "method" is the default mode.
I don't even know any platform acting in this way.
But it is likely there are such platforms as the default "method" mode may cause them invisible to us.

The broken cases related to the timing are:
1. If libinput writes "close" after button driver sends initial notification using _LID return value:
   For a platform that has no lid notification and _LID return cached "open" after boot/resume.
   If such a system connects to an external monitor, and users configure to use external display, then
   "open" will arrive user space programs first, thus internal monitor will be lit on and external one will be lit off.
   Which is not what users want.
2. If libinput writes "close" before button driver sends initial notification using _LID return value:
   For a platform that has no lid notification and _LID return cached "close" after boot/resume.
   If such a system connects to an external monitor, and users configure to use internal display, then
   "close" will arrive userspace programs first, thus internal monitor will be lit off and external one will be lit on.
   Which is not what users want.
   There are even more cases broken if libinput writes "close" before button driver sends initial notification using _LID return value.
So either 1 or 2 will be broken.
What I supposed was 1 would be broken thus I listed such theoretical platform.

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
@ 2017-05-17  7:32                                 ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-17  7:32 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann, linux-acpi,
	linux-kernel

Hi, Benjamin

> > What's that?
> > I mean, the bad faith?
> I already explained 4 times why we need to revert these two patches and
> why we need to keep 'method'. And you keep answering with long emails
> that you would rather not. I call it bad faith, sorry.

The 4 times explanations didn't answer my questions.
But that's OK, let's clarify it again.

> > > This is a REGRESSION. It used to work on thousands of devices, it
> > > doesn't anymore. So any regression has to be chased down and no good
> > > reason can justify such a regression.
> > I triggered many such kind of layered regressions and did fix them 1 by 1 in different places.
> > However, this might be different.
> No. It is a regression. It used to work for thousands of devices befor
> v4.11, and now it's broken for those devices. It's a regression.
> Some new devices are broken with "method", it's a bug, and we can't fix
> them by regressing on all the others.
...
> I call this "fixing by users", and this is wrong. It used to work for
> years for almost everybody, you can not ask users to fix this one by
> one.

What about regressions triggered by this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=23de5d9ef2a4
Before that (year 2007), "ignore" is the default mode.

Other than this, I just had concerns related to fixing things back and forth, but you didn't reply properly.
Again, that's OK, let's just clarify it.

> Yes, it's called a quirk. And the good practice is to register those
> quirks and make them available to everybody. Being in hwdb in user space
> or in acpi/button in kernel space doesn't matter, we need them.

I have no objections but concerns related to the combination of "default mode" and "quirk responsibles".
>From my point of view, these are my conclusions:
1. If you want to use libinput to generate quirks, you should use "ignore" rather than "method" mode as default mode;
2. If you want to use button driver to generate quirks, we need "close" mode;
3. If GDM can change or users are ok use command lines, we can remain to use "open" as the default behavior.
(I'll send technical details in private about these conclusions)
But you seem to always:
1. Say no to "ignore" which makes 1 impossible;
2. Say no to "close" which makes 2 impossible;
3. Say no to "open" which makes 3 impossible.

> > We haven't asked user space to change.
> > We are just discussing the correctness of some user space behaviors.
> 
> They *are* correct.
> They are following the exported ACPI documentation

I doubt. In ACPI world, Windows is the only standard.

> and the input node documentation.
> Quoting the input doc:
> file Documentation/input/event-codes.rst:
> EV_SW
> -----
> 
> EV_SW events describe stateful binary switches. For example, the SW_LID code is
> used to denote when a laptop lid is closed.
> 
> Upon binding to a device or resuming from suspend, a driver must report
> the current switch state. This ensures that the device, kernel, and userspace
> state is in sync.
> 
> Upon resume, if the switch state is the same as before suspend, then the input
> subsystem will filter out the duplicate switch state reports. The driver does
> not need to keep the state of the switch at any time.
> ----

That's really a convenient feature for driver.
If I'm the driver writers, I would be very appreciated for being able to use such features.
So you see I don't have objections to having this feature.

I just have concerns related to:
1. Is it required to have a timeout in systemd, forcing platform to suspend again, just due to event delays?
2. Is it required to use SW_LID to determine whether an internal display should be lit on?
I don't see any conflicts between the ABI of EV_SW and the 2 questions.

> So no, you can't have 'ignore' or 'open' to be the default, because user
> space expects the switch to reflect the current state of the hardware.

Then what's the benefit of having 'method' to be the default,
Given it is still not able to reliably deliver the current state of hardware?
Actually both 'ignore/open/method' modes are trying to be compliant to EV_SW.
Among them, "ignore" did the best IMO.
And cases broken in "ignore" mode but not broken in "method" mode are all issues:
 - Platform doesn't send notification after boot/resume.
   IMO, we should also collect them and indicate them to desktop managers.

So in the end, we just have differences related to picking which default mode.

> > > You can not also change the semantic of an input switch. An input
> > > switch, as per the input subsystem is supposed to forward an actual
> > > state of the underlying hardware. Any fake information is bad and has to
> > > be avoided.
> > Since fake events are harmful, why do we fake an event after boot/resume?
> > button.lid_init_state=method seems can fake such an event.
> We don't fake an event, we are syncing the input switch state with the
> hardware.
> Faking an event is when you send "switch is open" while you know there
> is a possibility it's actually closed.

No we are faking events in this mode.
_LID can return cached state, and:
1. it might be "close" while LID is "open" after suspend.
2. it might be "open" while LID is "close" after suspend.
In this case, it explains the side effect of having type 1 fake event:
https://bugzilla.kernel.org/show_bug.cgi?id=106151
If we don't evaluate _LID and send such a "close", the platform can correctly send "open" just with a delay.

> > > I already gave you 2 solutions to fix the 7 machines you see that are
> > > problematic, and you just seem to ignore them:
> > > - revert to the v4.10 behavior and let libinput fix that for you
> >
> > I already chose this.
> > But I just raised a concern that button.lid_init_state=method could bring troubles to libinput
> quirks.
> 
> No. We can do whatever we want in libinput. And we can fix hardware when
> it appears. We don't need to have the correct solution for everybody,
> ever. libinput is a library and it can be updated, and we can ask users
> to upgrade. We can even break the API by bumping the soname. We have
> much more liberty that the kernel has, so the libinput implementation
> shouldn't be a concern.

It seems you don't know all the details I was looking at.
It's about a timing problem, with button.lid_init_state=method and libinput quirk, we actually have 2 quirks.
And libinput write can appear before/after the faked init notification triggered by "method" mode.
Causing some cases not workaroundable.
See below for details.

> > No, I currently cannot persuade myself to revert to "method" mode.
> > But that doesn't mean I don't want to or I want to.
> > You just didn't prove that by answering my questions in PATCH 1/2 and in this email.
> > Especially:
> > 1. After reverting to "method" mode, can libinput work all cases around?
> If it doesn't, we'll fix it. So yes, it will eventually.

That's great.

> >    Are there any timing issues that can prevent libinput from working some sucks around.
> >    Considering this case, it's likely such problems can happen.
> >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> 
> logind has a delay between the time it starts and the time it starts
> polling the lid switch state. The default value is a little bit too
> short on Fedora considering the faulty devices are running small CPUs.
> But this can be changed as wish by the user and by systemd. They told us
> they took one arbitrary value by default, and we can change it.
> 
> We can ask systemd/logind to change part of the behavior for new devices
> when there is a bug. But we can't ask them to change the whole design
> because there is a regression in the kernel.

That sounds great!
If systemd/logind can be changed, can we ask systemd/logind to change it to be longer enough.
For example, allowing users to configure this to "INFINITE"?
That solves many other problems.

NOTE, EV_SW is a good feature to reduce duplicated events,
but that doesn't mean users of EV_SW need to be so strict to the timings of SW_LID, right?

If it can be configured to "INFINITE", with "ignore" mode,
systemd/logind should always be able to see an "open" event before seeing a new BIOS triggered "close" events.

So why do you refuse the possibilities of using "open/ignore" as default modes?

> > 2. Who should be responsible for fixing this issue after reverting to "method" mode:
> >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> 
> libinput will change the value to open given the heuristics we have.

Yes, I see.
But I also can see one broken case cannot be worked around by libinput if you insist to use "method".

> >    What's the root cause of this issue?
> 
> Poorly written EC?
> It doesn't matter. We know the machine is buggy, we'll just need a
> workaround.

It seems you know better than the SAMSUNG official supporters?
The 10-20 seconds lag can be seen even on Windows:
http://www.samsung.com/uk/support/model/NP-N210-JP02UK

Since you seem to be able to see something I'm not aware of.
Let me ask, which do you mean by using word "poorly", EC driver or EC firmware?
If it's EC driver, can you fix it?

> > 3. How can we generate the quirk for the following possible breakage:
> >    No lid notification and _LID return cached open after boot/resume
> 
> "method" will fetch and report the cached _LID value as open, so there
> is no breakage. Unless the lid is actually closed and the EC is plain
> wrong, but that is something we can also explain to users or fix by an
> other mean. But nothing generic will work. For instance, on the Surface
> 3, the LID notification for open isn't forwarded, so I wrote a specific
> driver to get the correct behavior based on a careful analysis of the
> DSDT.

First, I couldn't see anything related to EC here but you kept on talking EC.
Do you have personal feelings against EC?

In fact, this is just a theoretical issue showing that something cannot be worked around by libinput when "method" is the default mode.
I don't even know any platform acting in this way.
But it is likely there are such platforms as the default "method" mode may cause them invisible to us.

The broken cases related to the timing are:
1. If libinput writes "close" after button driver sends initial notification using _LID return value:
   For a platform that has no lid notification and _LID return cached "open" after boot/resume.
   If such a system connects to an external monitor, and users configure to use external display, then
   "open" will arrive user space programs first, thus internal monitor will be lit on and external one will be lit off.
   Which is not what users want.
2. If libinput writes "close" before button driver sends initial notification using _LID return value:
   For a platform that has no lid notification and _LID return cached "close" after boot/resume.
   If such a system connects to an external monitor, and users configure to use internal display, then
   "close" will arrive userspace programs first, thus internal monitor will be lit off and external one will be lit on.
   Which is not what users want.
   There are even more cases broken if libinput writes "close" before button driver sends initial notification using _LID return value.
So either 1 or 2 will be broken.
What I supposed was 1 would be broken thus I listed such theoretical platform.

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-17  7:32                                 ` Zheng, Lv
  (?)
@ 2017-05-17 11:54                                 ` Peter Hutterer
  -1 siblings, 0 replies; 38+ messages in thread
From: Peter Hutterer @ 2017-05-17 11:54 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Benjamin Tissoires, Rafael J. Wysocki, Rafael J. Wysocki,
	Jiri Eischmann, linux-acpi, linux-kernel

Hi Lv

> > Yes, it's called a quirk. And the good practice is to register those
> > quirks and make them available to everybody. Being in hwdb in user space
> > or in acpi/button in kernel space doesn't matter, we need them.
> 
> I have no objections but concerns related to the combination of "default mode" and "quirk responsibles".
> From my point of view, these are my conclusions:
> 1. If you want to use libinput to generate quirks, you should use "ignore"
> rather than "method" mode as default mode;

libinput does not replace the kernel. libinput can help with some
heuristics, but that should be the exception, not the default. And
heuristics cannot detect some cases, ignore suffers from that problem (see
below).

> 2. If you want to use button driver to generate quirks, we need "close" mode;
> 3. If GDM can change or users are ok use command lines, we can remain to use "open" as the default behavior.
> (I'll send technical details in private about these conclusions)
> But you seem to always:
> 1. Say no to "ignore" which makes 1 impossible;

as an outsider, 'ignore' makes me wonder: Why wouldn't you query the state
of the hardware if it lets you? That's what we do in userspace with EV_SW.
We don't just rely on the notification, we look at the state of it too.

sure, some hardware is buggy and we can somehow work around this but that's
not a reason to throw everyone else under the bus too.

> 2. Say no to "close" which makes 2 impossible;

'close' forces userspace to fix up the kernel in every case rather than for
those devices where method doesn't work correctly. That's just effectively
deciding to be always the least wrong just to avoid a few outliers.
(also, if the kernel is always wrong, what purpose does it serve? :)

> 3. Say no to "open" which makes 3 impossible.

as mentioned above, some things we cannot detect.
we cannot detect a false-open with heuristics, this makes it impossible to
fix with heuristics in userspace. for 'close' and 'method' we can take user
input as indication that the lid isn't as closed as it pretends to be. That
doesn't work the other way round, lack of user input does not imply the
lid is closed.

> > > We haven't asked user space to change.
> > > We are just discussing the correctness of some user space behaviors.
> >
> > They *are* correct.
> > They are following the exported ACPI documentation
> 
> I doubt. In ACPI world, Windows is the only standard.

ok, here I need to note something: ACPI doesn't matter to userspace.
the kernel has EV_SW/SW_LID, which is *not* ACPI. that promises that a
userspace can assume that if SW_LID is 1, then the lid is closed,
otherwise it's open. Some leeway can be given because, well, reality, 
but a userspace behaviour to assume a lid is closed when the lid switch is
set to that state should not be up for discussion.
 
> > and the input node documentation.
> > Quoting the input doc:
> > file Documentation/input/event-codes.rst:
> > EV_SW
> > -----
> >
> > EV_SW events describe stateful binary switches. For example, the SW_LID code is
> > used to denote when a laptop lid is closed.
> >
> > Upon binding to a device or resuming from suspend, a driver must report
> > the current switch state. This ensures that the device, kernel, and userspace
> > state is in sync.
> >
> > Upon resume, if the switch state is the same as before suspend, then the input
> > subsystem will filter out the duplicate switch state reports. The driver does
> > not need to keep the state of the switch at any time.
> > ----
> 
> That's really a convenient feature for driver.
> If I'm the driver writers, I would be very appreciated for being able to use such features.
> So you see I don't have objections to having this feature.
> 
> I just have concerns related to:
> 1. Is it required to have a timeout in systemd, forcing platform to suspend again, just due to event delays?
> 2. Is it required to use SW_LID to determine whether an internal display should be lit on?
> I don't see any conflicts between the ABI of EV_SW and the 2 questions.

the ABI of EV_SW? that's just evdev. It seems you're discussing three, four
things simultaneously, ACPI, the button driver, the evdev switch API and
userspace behaviour in response to this whole thread. The simple answer is:
the last bit doesn't matter - if EV_SW/SW_LID is on, the lid can be assumed
to be closed.

How you get to this point is your side of the problem :)
And I'm willing to accommodate for some issues (see the heuristics benjamin
already mentioned) but they should be exception, not the rule.

but whether e.g. displays are lit or not has nothing to do with this
discussion, it just doesn't matter what userspace does with the information.

> > So no, you can't have 'ignore' or 'open' to be the default, because user
> > space expects the switch to reflect the current state of the hardware.
> 
> Then what's the benefit of having 'method' to be the default,
> Given it is still not able to reliably deliver the current state of hardware?

isn't this just a case of: method is less wrong than ignore? 100%
reliability is not possible given the hardware available on the market, but
it seems that 'method' is the least wrong of them, followed by ignore,
trailed by open and close.

> Actually both 'ignore/open/method' modes are trying to be compliant to EV_SW.

"compliant to EV_SW" just means "SW_LID state reflects state of the
hardware", right?

> Among them, "ignore" did the best IMO.

how did ignore do better than method? what devices give you an incorrect
state but send you a correct notification later? And even then it would only
matter for 50% of the cases anyway, because a false state of open first
followed by a close notification later wouldn't matter.

> And cases broken in "ignore" mode but not broken in "method" mode are all issues:
>  - Platform doesn't send notification after boot/resume.
>    IMO, we should also collect them and indicate them to desktop managers.
> 
> So in the end, we just have differences related to picking which default mode.
> 
> > > > You can not also change the semantic of an input switch. An input
> > > > switch, as per the input subsystem is supposed to forward an actual
> > > > state of the underlying hardware. Any fake information is bad and has to
> > > > be avoided.
> > > Since fake events are harmful, why do we fake an event after boot/resume?
> > > button.lid_init_state=method seems can fake such an event.
> > We don't fake an event, we are syncing the input switch state with the
> > hardware.
> > Faking an event is when you send "switch is open" while you know there
> > is a possibility it's actually closed.
> 
> No we are faking events in this mode.
> _LID can return cached state, and:
> 1. it might be "close" while LID is "open" after suspend.
> 2. it might be "open" while LID is "close" after suspend.
> In this case, it explains the side effect of having type 1 fake event:
> https://bugzilla.kernel.org/show_bug.cgi?id=106151
> If we don't evaluate _LID and send such a "close", the platform can
> correctly send "open" just with a delay.

side-note: _LID and LID makes for confusing reading. _LID is ACPI and LID is
SW_LID? 

if acpi/the hw gives you an incorrect cached state, you'll need to quirk
*that* hardware, because it's more broken than others.

> > > > I already gave you 2 solutions to fix the 7 machines you see that are
> > > > problematic, and you just seem to ignore them:
> > > > - revert to the v4.10 behavior and let libinput fix that for you
> > >
> > > I already chose this.
> > > But I just raised a concern that button.lid_init_state=method could bring troubles to libinput
> > quirks.
> >
> > No. We can do whatever we want in libinput. And we can fix hardware when
> > it appears. We don't need to have the correct solution for everybody,
> > ever. libinput is a library and it can be updated, and we can ask users
> > to upgrade. We can even break the API by bumping the soname. We have
> > much more liberty that the kernel has, so the libinput implementation
> > shouldn't be a concern.
> 
> It seems you don't know all the details I was looking at.
> It's about a timing problem, with button.lid_init_state=method and libinput quirk, we actually have 2 quirks.
> And libinput write can appear before/after the faked init notification triggered by "method" mode.
> Causing some cases not workaroundable.
> See below for details.

see below for answers, but basically:
if libinput writes the correct state before the notification arrives,
*nothing* happens in userspace. The kernel filters duplicate events, so even
if libinput sets the state before the kernel driver does, userspace won't
receive an event.
 
> > > No, I currently cannot persuade myself to revert to "method" mode.
> > > But that doesn't mean I don't want to or I want to.
> > > You just didn't prove that by answering my questions in PATCH 1/2 and in this email.
> > > Especially:
> > > 1. After reverting to "method" mode, can libinput work all cases around?
> > If it doesn't, we'll fix it. So yes, it will eventually.
> 
> That's great.
> 
> > >    Are there any timing issues that can prevent libinput from working some sucks around.
> > >    Considering this case, it's likely such problems can happen.
> > >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> >
> > logind has a delay between the time it starts and the time it starts
> > polling the lid switch state. The default value is a little bit too
> > short on Fedora considering the faulty devices are running small CPUs.
> > But this can be changed as wish by the user and by systemd. They told us
> > they took one arbitrary value by default, and we can change it.
> >
> > We can ask systemd/logind to change part of the behavior for new devices
> > when there is a bug. But we can't ask them to change the whole design
> > because there is a regression in the kernel.
> 
> That sounds great!
> If systemd/logind can be changed, can we ask systemd/logind to change it to be longer enough.
> For example, allowing users to configure this to "INFINITE"?
> That solves many other problems.
> 
> NOTE, EV_SW is a good feature to reduce duplicated events,
> but that doesn't mean users of EV_SW need to be so strict to the timings of SW_LID, right?
> 
> If it can be configured to "INFINITE", with "ignore" mode,
> systemd/logind should always be able to see an "open" event before seeing a new BIOS triggered "close" events.
> 
> So why do you refuse the possibilities of using "open/ignore" as default modes?
> 
> > > 2. Who should be responsible for fixing this issue after reverting to "method" mode:
> > >    https://bugzilla.kernel.org/show_bug.cgi?id=106151
> >
> > libinput will change the value to open given the heuristics we have.
> 
> Yes, I see.
> But I also can see one broken case cannot be worked around by libinput if you insist to use "method".
> 
> > >    What's the root cause of this issue?
> >
> > Poorly written EC?
> > It doesn't matter. We know the machine is buggy, we'll just need a
> > workaround.
> 
> It seems you know better than the SAMSUNG official supporters?
> The 10-20 seconds lag can be seen even on Windows:
> http://www.samsung.com/uk/support/model/NP-N210-JP02UK
> 
> Since you seem to be able to see something I'm not aware of.
> Let me ask, which do you mean by using word "poorly", EC driver or EC firmware?
> If it's EC driver, can you fix it?
> 
> > > 3. How can we generate the quirk for the following possible breakage:
> > >    No lid notification and _LID return cached open after boot/resume
> >
> > "method" will fetch and report the cached _LID value as open, so there
> > is no breakage. Unless the lid is actually closed and the EC is plain
> > wrong, but that is something we can also explain to users or fix by an
> > other mean. But nothing generic will work. For instance, on the Surface
> > 3, the LID notification for open isn't forwarded, so I wrote a specific
> > driver to get the correct behavior based on a careful analysis of the
> > DSDT.
> 
> First, I couldn't see anything related to EC here but you kept on talking EC.
> Do you have personal feelings against EC?

let's not go down that type of discussion please...

> In fact, this is just a theoretical issue showing that something cannot be
> worked around by libinput when "method" is the default mode.
> I don't even know any platform acting in this way.
> But it is likely there are such platforms as the default "method" mode may
> cause them invisible to us.

note: it may be likely, but it's a *definitive* that existing platforms
break with the current 'open'...

> The broken cases related to the timing are:
> 1. If libinput writes "close" after button driver sends initial notification using _LID return value:

This is a theoretical case only. libinput does not write close, ever.
because there is no heuristics we can use that can guarantee that the lid is
closed (well, short of enabling the camera and checking if it looks really
dark, but let's not do that ;)

>    For a platform that has no lid notification and _LID return cached "open" after boot/resume.
>    If such a system connects to an external monitor, and users configure to use external display, then
>    "open" will arrive user space programs first, thus internal monitor will be lit on and external one will be lit off.
>    Which is not what users want.
> 2. If libinput writes "close" before button driver sends initial notification using _LID return value:

see 1, libinput does not write 'close'.

also, just in case that's not obvious yet: libinput behaviour is a) internal
to libinput and b) tailored to the hardware. The current behaviour works for
the devices we've needed to fix. We can change that at any time and we can
add other behaviours at any time. So even if some theoretical platforms
appear that do the wrong thing - we can deal with it then and figure out.

note also, that the only reason libinput even writes the switch event to the
event node is because we are nice to others. We have that information (user
is typing, therefore lid is not closed) so we share it with anyone else by
changing the device's state. This way other processes can pick it up,
without needing the same heuristics.

Cheers,
   Peter

>    For a platform that has no lid notification and _LID return cached "close" after boot/resume.
>    If such a system connects to an external monitor, and users configure to use internal display, then
>    "close" will arrive userspace programs first, thus internal monitor will be lit off and external one will be lit on.
>    Which is not what users want.
>    There are even more cases broken if libinput writes "close" before button driver sends initial notification using _LID return value.
> So either 1 or 2 will be broken.
> What I supposed was 1 would be broken thus I listed such theoretical platform.
> 
> Cheers,
> Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-17  7:32                                 ` Zheng, Lv
  (?)
  (?)
@ 2017-05-17 14:16                                 ` Benjamin Tissoires
  -1 siblings, 0 replies; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-17 14:16 UTC (permalink / raw)
  To: Zheng, Lv
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Jiri Eischmann,
	Peter Hutterer, linux-acpi, linux-kernel

Hi Lv,

[thank you Peter for jumping in the thread]

Just a few precisions regarding questions you asked:

On May 17 2017 or thereabouts, Zheng, Lv wrote:
[stripped]
> > [User space tools] *are* correct.
> > They are following the exported ACPI documentation
> 
> I doubt. In ACPI world, Windows is the only standard.

I was talking here about the i915 driver that uses the Linux ACPI
documentation in regard to the call of acpi_lid_open().
User space tool shouldn't rely on the _LID call through the sysfs if
acpi/button.c actaully forwards the ACPI call, because it is clear now
that it is not reliable in all situations.

> 
> > and the input node documentation.
> > Quoting the input doc:
> > file Documentation/input/event-codes.rst:
> > EV_SW
> > -----
> > 
> > EV_SW events describe stateful binary switches. For example, the SW_LID code is
> > used to denote when a laptop lid is closed.
> > 
> > Upon binding to a device or resuming from suspend, a driver must report
> > the current switch state. This ensures that the device, kernel, and userspace
> > state is in sync.
> > 
> > Upon resume, if the switch state is the same as before suspend, then the input
> > subsystem will filter out the duplicate switch state reports. The driver does
> > not need to keep the state of the switch at any time.
> > ----
> 
> That's really a convenient feature for driver.
> If I'm the driver writers, I would be very appreciated for being able to use such features.
> So you see I don't have objections to having this feature.

It's not a feature on how the system could be improved, it's the
definition of an input node that relay an EV_SW element. acpi/button.c
exports a LID_SW, so it has to conform to the definition. That is not
something we should be discussing.

> 
> I just have concerns related to:
> 1. Is it required to have a timeout in systemd, forcing platform to suspend again, just due to event delays?
> 2. Is it required to use SW_LID to determine whether an internal display should be lit on?
> I don't see any conflicts between the ABI of EV_SW and the 2 questions.

As Peter mentioned, the users of the switch are doing what they want.
But if you don't have the LID_SW input node, you can not know the state
of the laptop (opened or closed). So yes, it's there for a reason.

[stripped]

> > No. We can do whatever we want in libinput. And we can fix hardware when
> > it appears. We don't need to have the correct solution for everybody,
> > ever. libinput is a library and it can be updated, and we can ask users
> > to upgrade. We can even break the API by bumping the soname. We have
> > much more liberty that the kernel has, so the libinput implementation
> > shouldn't be a concern.
> 
> It seems you don't know all the details I was looking at.
> It's about a timing problem, with button.lid_init_state=method and libinput quirk, we actually have 2 quirks.
> And libinput write can appear before/after the faked init notification triggered by "method" mode.

See Peter's answer (nothing will happen in user-space), but I just
wanted to raised the fact that the "method" parameter doesn't trigger
"faked init notifications". The "method" parameter only syncs the state
with the hardware. And given this happens in the .resume() callback,
it means that users will have to wait for the resume to finish before
taking any action. So it's on their side that there is a problem if they
are reading the state before the kernel even had the chance to update
it.

[stripped]

> That sounds great!
> If systemd/logind can be changed, can we ask systemd/logind to change it to be longer enough.
> For example, allowing users to configure this to "INFINITE"?

I think they have a good reason but I never fully understood why they
have to poll on the state.

> That solves many other problems.
> 
> NOTE, EV_SW is a good feature to reduce duplicated events,
> but that doesn't mean users of EV_SW need to be so strict to the timings of SW_LID, right?

When an input event happens over evdev, it's expected that the user
treats it immediately.
In the LID_SW case we can somewhat tell user space that they need to be
extra careful, but they can argue that we should comply to the
definition of an input switch, or not exporting it at all.

[stripped]

> 
> > >    What's the root cause of this issue?
> > 
> > Poorly written EC?
> > It doesn't matter. We know the machine is buggy, we'll just need a
> > workaround.
> 
> It seems you know better than the SAMSUNG official supporters?

I wouldn't pretend to know more than Samsung engineers. All I see is
that there is an issue in the notification and/or the acpi _LID call, so
all I can say is that there is a bug in the firmware of the embedded
controller in charge of answering acpi calls.

> The 10-20 seconds lag can be seen even on Windows:
> http://www.samsung.com/uk/support/model/NP-N210-JP02UK
> 
> Since you seem to be able to see something I'm not aware of.
> Let me ask, which do you mean by using word "poorly", EC driver or EC firmware?

'Driver' implies in the kernel, and I never intended that there was such
a bug. If there is, then yes, I'd expect the people in charge of it to
fix it. But given the bug also happens in Windows, it's more likely that
it's a firmware issue, and it's the responsibility of Samsung to fix it.

I thought using EC was more precise that when talking about 'ACPI'
because ACPI can also be the project to support it under Linux, not the
FW itself.

> If it's EC driver, can you fix it?

I can't say I can fix anything. I don't have the machine and don't have
much time to dedicate to it. So no I guess.

> 
> > > 3. How can we generate the quirk for the following possible breakage:
> > >    No lid notification and _LID return cached open after boot/resume
> > 
> > "method" will fetch and report the cached _LID value as open, so there
> > is no breakage. Unless the lid is actually closed and the EC is plain
> > wrong, but that is something we can also explain to users or fix by an
> > other mean. But nothing generic will work. For instance, on the Surface
> > 3, the LID notification for open isn't forwarded, so I wrote a specific
> > driver to get the correct behavior based on a careful analysis of the
> > DSDT.
> 
> First, I couldn't see anything related to EC here but you kept on talking EC.
> Do you have personal feelings against EC?

For me EC == Embedded Controller == Firmware written in it that we both not
control (unless Intel writes part of it, which I am not aware of).
Sorry if EC means something else for you.

Cheers,
Benjamin

> 
> In fact, this is just a theoretical issue showing that something cannot be worked around by libinput when "method" is the default mode.
> I don't even know any platform acting in this way.
> But it is likely there are such platforms as the default "method" mode may cause them invisible to us.
> 
> The broken cases related to the timing are:
> 1. If libinput writes "close" after button driver sends initial notification using _LID return value:
>    For a platform that has no lid notification and _LID return cached "open" after boot/resume.
>    If such a system connects to an external monitor, and users configure to use external display, then
>    "open" will arrive user space programs first, thus internal monitor will be lit on and external one will be lit off.
>    Which is not what users want.
> 2. If libinput writes "close" before button driver sends initial notification using _LID return value:
>    For a platform that has no lid notification and _LID return cached "close" after boot/resume.
>    If such a system connects to an external monitor, and users configure to use internal display, then
>    "close" will arrive userspace programs first, thus internal monitor will be lit off and external one will be lit on.
>    Which is not what users want.
>    There are even more cases broken if libinput writes "close" before button driver sends initial notification using _LID return value.
> So either 1 or 2 will be broken.
> What I supposed was 1 would be broken thus I listed such theoretical platform.
> 
> Cheers,
> Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-15 11:01                 ` Rafael J. Wysocki
  2017-05-15 13:05                   ` Benjamin Tissoires
@ 2017-05-24  8:08                   ` Benjamin Tissoires
  2017-05-24 23:01                     ` Rafael J. Wysocki
  2017-05-26  7:39                     ` Peter Hutterer
  1 sibling, 2 replies; 38+ messages in thread
From: Benjamin Tissoires @ 2017-05-24  8:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

Hi Rafael,

On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> >> >> Benjamin, my understanding is that this is the case, is it correct?
> >> >
> >> > That is correct. This patch I reverted introduces regression for professional
> >> > laptops that expect the LID switch to be reported accurately.
> >>
> >> And from a user's perspective, what does not work any more?
> >
> > If you boot or resume your laptop with the lid closed on a docking
> > station while using an external monitor connected to it, both internal
> > and external displays will light on, while only the external should.
> >
> > There is a design choice in gdm to only provide the greater on the
> > internal display when lit on, so users only see a gray area on the
> > external monitor. Also, the cursor will not show up as it's by default
> > on the internal display too.
> >
> > To "fix" that, users have to open the laptop once and close it once
> > again to sync the state of the switch with the hardware state.
> 
> OK
> 
> Yeah, that sucks.
> 
> So without the Lv's patch the behavior (on the systems in question) is
> as expected, right?

Would you agree to take both these reverts without Lv's ACK? We already
tried to explain for 2 weeks that they are valuable, but it seems we
can't make change his mind.

I have more that 26 emails in my INBOX (not counting my replies) and I
would really like switching to more valuable work than explaining again
and again that when a regression is introduced, it needs to be fixed (or
reverted in that case).

Cheers,
Benjamin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-24  8:08                   ` Benjamin Tissoires
@ 2017-05-24 23:01                     ` Rafael J. Wysocki
  2017-05-25  6:17                       ` Zheng, Lv
  2017-05-26  7:39                     ` Peter Hutterer
  1 sibling, 1 reply; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-24 23:01 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann,
	linux-acpi, linux-kernel

On Wed, May 24, 2017 at 10:08 AM, Benjamin Tissoires
<benjamin.tissoires@redhat.com> wrote:
> Hi Rafael,
>
> On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
>> >> >> Benjamin, my understanding is that this is the case, is it correct?
>> >> >
>> >> > That is correct. This patch I reverted introduces regression for professional
>> >> > laptops that expect the LID switch to be reported accurately.
>> >>
>> >> And from a user's perspective, what does not work any more?
>> >
>> > If you boot or resume your laptop with the lid closed on a docking
>> > station while using an external monitor connected to it, both internal
>> > and external displays will light on, while only the external should.
>> >
>> > There is a design choice in gdm to only provide the greater on the
>> > internal display when lit on, so users only see a gray area on the
>> > external monitor. Also, the cursor will not show up as it's by default
>> > on the internal display too.
>> >
>> > To "fix" that, users have to open the laptop once and close it once
>> > again to sync the state of the switch with the hardware state.
>>
>> OK
>>
>> Yeah, that sucks.
>>
>> So without the Lv's patch the behavior (on the systems in question) is
>> as expected, right?
>
> Would you agree to take both these reverts without Lv's ACK? We already
> tried to explain for 2 weeks that they are valuable, but it seems we
> can't make change his mind.

One of the reverts actually is already in (as a patch from Lv) and
I'll most probably push the other one for -rc4 next week.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-24 23:01                     ` Rafael J. Wysocki
@ 2017-05-25  6:17                       ` Zheng, Lv
  0 siblings, 0 replies; 38+ messages in thread
From: Zheng, Lv @ 2017-05-25  6:17 UTC (permalink / raw)
  To: Rafael J. Wysocki, Benjamin Tissoires
  Cc: Rafael J. Wysocki, Jiri Eischmann, linux-acpi, linux-kernel

Hi,

> >> >> >> Benjamin, my understanding is that this is the case, is it correct?
> >> >> >
> >> >> > That is correct. This patch I reverted introduces regression for professional
> >> >> > laptops that expect the LID switch to be reported accurately.
> >> >>
> >> >> And from a user's perspective, what does not work any more?
> >> >
> >> > If you boot or resume your laptop with the lid closed on a docking
> >> > station while using an external monitor connected to it, both internal
> >> > and external displays will light on, while only the external should.
> >> >
> >> > There is a design choice in gdm to only provide the greater on the
> >> > internal display when lit on, so users only see a gray area on the
> >> > external monitor. Also, the cursor will not show up as it's by default
> >> > on the internal display too.
> >> >
> >> > To "fix" that, users have to open the laptop once and close it once
> >> > again to sync the state of the switch with the hardware state.
> >>
> >> OK
> >>
> >> Yeah, that sucks.
> >>
> >> So without the Lv's patch the behavior (on the systems in question) is
> >> as expected, right?
> >
> > Would you agree to take both these reverts without Lv's ACK? We already
> > tried to explain for 2 weeks that they are valuable, but it seems we
> > can't make change his mind.

It's not that difficult to get an agreement.
We just didn't communicate well.

> One of the reverts actually is already in (as a patch from Lv) and
> I'll most probably push the other one for -rc4 next week.

If we really want to go back to "method" mode.
We need one more patch and it is not in Benjamin's series.

There are 3 known broken cases, 2 of them are related to orders.
The last 1 is not order related:
1. Surface Pro 3: open arrives very early to update cached value, not notification
2. Samsung N210+: open arrives very late to update cached value, not notification 
3. Surface Pro 1: no open event, _LID keeps on returning close

The order problem is (considering method mode):

_Qxx <- Invoked due to EC events
  Update _LID return value
  Notify(LID, close)
    input_report(SW_LID 1) -> captured by user space and system starts to suspend
acpi_button_suspend
acpi_ec_suspend
  acpi_ec_disable_event
acpi_button_resume
  if (method)
    input_report(SW_LID, _LID return value, would be 1 for cached value)
acpi_ec_resume
  acpi_ec_enable_event
_Qxx <- Invoked due to EC events, for broken case 3, no such event
  Update _LID return value
  Notify(LID, open) <- for broken case 1, 2, 3, no such notification, thus open cannot be delivered to user space.
    input_report(SW_LID, 0)

The order of acpi_button_resume()/acpi_ec_resume() is determined by the enumeration order.
So it could vary on different platforms.
Considering case 1, for surface pro 3, where acpi_button_resume() is invoked before acpi_ec_resume().
Button driver will send false "close" to user space, and the updated "open" state won't be delivered to user space.
Staying in method mode, we can only suspend the system once, follow-up "close" events won't arrive to user space.

Even we can add many workarounds to make sure acpi_ec_resume() is executed before acpi_button_resume() on such platforms.
We still cannot fix case 2 and case 3.
So finally this order still cannot be ensured, and the solution is still not stable.
I would imagine the order problem is the key reason why MS stops sending "open" on these platforms.

Then given this order issue is not fixable, we need to fix broken case 1 by adding "complement event" in "method" mode.
Why don't we do this first before reverting to "method" mode?

And for broken stuffs like case 2/case 3,
IMO, they can only be solved using "ignore" mode, and changes in user space.
If we don't use this solution, we still need libinput quirks to handle them (may possibly encounter some new unfixable problems).
If you want to stay in "method" mode, will you be responsible for responding end users on kernel Bugzilla using libinput quirks?
We have a Power-Lid category now, we can have you guys as default assignee.

Cheers,
Lv

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-24  8:08                   ` Benjamin Tissoires
  2017-05-24 23:01                     ` Rafael J. Wysocki
@ 2017-05-26  7:39                     ` Peter Hutterer
  2017-05-27 19:23                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 38+ messages in thread
From: Peter Hutterer @ 2017-05-26  7:39 UTC (permalink / raw)
  To: Benjamin Tissoires
  Cc: Rafael J. Wysocki, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On May 25 2017, Benjamin Tissoires wrote:
> On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:

> > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > >> >
> > >> > That is correct. This patch I reverted introduces regression for professional
> > >> > laptops that expect the LID switch to be reported accurately.
> > >>
> > >> And from a user's perspective, what does not work any more?
> > >
> > > If you boot or resume your laptop with the lid closed on a docking
> > > station while using an external monitor connected to it, both internal
> > > and external displays will light on, while only the external should.
> > >
> > > There is a design choice in gdm to only provide the greater on the
> > > internal display when lit on, so users only see a gray area on the
> > > external monitor. Also, the cursor will not show up as it's by default
> > > on the internal display too.
> > >
> > > To "fix" that, users have to open the laptop once and close it once
> > > again to sync the state of the switch with the hardware state.
> >
> > OK
> >
> > Yeah, that sucks.
> >
> > So without the Lv's patch the behavior (on the systems in question) is
> > as expected, right?
> 
> Would you agree to take both these reverts without Lv's ACK? We already
> tried to explain for 2 weeks that they are valuable, but it seems we
> can't make change his mind.
> 
> I have more that 26 emails in my INBOX (not counting my replies) and I
> would really like switching to more valuable work than explaining again
> and again that when a regression is introduced, it needs to be fixed (or
> reverted in that case).

Yes please. This should have stopped right after "regression on basically
every decent laptop out there" and we should be discussing how to fix the
devices that actually need quirks because they're broken. Instead it 
turned into a discussion on why we should stick with the regression and
convince all of userspace to change and implement broken heuristics. I've
used up my time budget for that.

Cheers,
   Peter

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open"
  2017-05-26  7:39                     ` Peter Hutterer
@ 2017-05-27 19:23                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 38+ messages in thread
From: Rafael J. Wysocki @ 2017-05-27 19:23 UTC (permalink / raw)
  To: Peter Hutterer
  Cc: Benjamin Tissoires, Zheng, Lv, Jiri Eischmann, linux-acpi, linux-kernel

On Friday, May 26, 2017 05:39:53 PM Peter Hutterer wrote:
> On May 25 2017, Benjamin Tissoires wrote:
> > On May 15 2017 or thereabouts, Rafael J. Wysocki wrote:
> 
> > > >> >> Benjamin, my understanding is that this is the case, is it correct?
> > > >> >
> > > >> > That is correct. This patch I reverted introduces regression for professional
> > > >> > laptops that expect the LID switch to be reported accurately.
> > > >>
> > > >> And from a user's perspective, what does not work any more?
> > > >
> > > > If you boot or resume your laptop with the lid closed on a docking
> > > > station while using an external monitor connected to it, both internal
> > > > and external displays will light on, while only the external should.
> > > >
> > > > There is a design choice in gdm to only provide the greater on the
> > > > internal display when lit on, so users only see a gray area on the
> > > > external monitor. Also, the cursor will not show up as it's by default
> > > > on the internal display too.
> > > >
> > > > To "fix" that, users have to open the laptop once and close it once
> > > > again to sync the state of the switch with the hardware state.
> > >
> > > OK
> > >
> > > Yeah, that sucks.
> > >
> > > So without the Lv's patch the behavior (on the systems in question) is
> > > as expected, right?
> > 
> > Would you agree to take both these reverts without Lv's ACK? We already
> > tried to explain for 2 weeks that they are valuable, but it seems we
> > can't make change his mind.
> > 
> > I have more that 26 emails in my INBOX (not counting my replies) and I
> > would really like switching to more valuable work than explaining again
> > and again that when a regression is introduced, it needs to be fixed (or
> > reverted in that case).
> 
> Yes please. This should have stopped right after "regression on basically
> every decent laptop out there" and we should be discussing how to fix the
> devices that actually need quirks because they're broken. Instead it 
> turned into a discussion on why we should stick with the regression and
> convince all of userspace to change and implement broken heuristics. I've
> used up my time budget for that.

Appreciated.

Also please note that it actually might help to make the decision.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2017-05-27 19:30 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-10 16:12 [PATCH 0/2] acpi/button: revert v4.10 behavior Benjamin Tissoires
2017-05-10 16:12 ` [PATCH 1/2] Revert "ACPI / button: Remove lid_init_state=method mode" Benjamin Tissoires
2017-05-11  0:58   ` Zheng, Lv
2017-05-11  1:19     ` Zheng, Lv
2017-05-11 10:12       ` Benjamin Tissoires
2017-05-12  5:08         ` Zheng, Lv
2017-05-12  9:50           ` Benjamin Tissoires
2017-05-15  4:54             ` Zheng, Lv
2017-05-15  4:54               ` Zheng, Lv
2017-05-15  7:42               ` Benjamin Tissoires
2017-05-16  5:05                 ` Zheng, Lv
2017-05-10 16:12 ` [PATCH 2/2] Revert "ACPI / button: Change default behavior to lid_init_state=open" Benjamin Tissoires
2017-05-11  0:59   ` Zheng, Lv
2017-05-11  9:45     ` Benjamin Tissoires
2017-05-12  2:36       ` Zheng, Lv
2017-05-12 21:50         ` Rafael J. Wysocki
2017-05-15  7:45           ` Benjamin Tissoires
2017-05-15  9:17             ` Rafael J. Wysocki
2017-05-15  9:37               ` Benjamin Tissoires
2017-05-15 11:01                 ` Rafael J. Wysocki
2017-05-15 13:05                   ` Benjamin Tissoires
2017-05-16  5:33                     ` Zheng, Lv
2017-05-16  5:47                       ` Zheng, Lv
2017-05-16  5:47                         ` Zheng, Lv
2017-05-16  7:15                         ` Benjamin Tissoires
2017-05-16  8:30                           ` Zheng, Lv
2017-05-16 10:10                             ` Benjamin Tissoires
2017-05-17  7:32                               ` Zheng, Lv
2017-05-17  7:32                                 ` Zheng, Lv
2017-05-17 11:54                                 ` Peter Hutterer
2017-05-17 14:16                                 ` Benjamin Tissoires
2017-05-24  8:08                   ` Benjamin Tissoires
2017-05-24 23:01                     ` Rafael J. Wysocki
2017-05-25  6:17                       ` Zheng, Lv
2017-05-26  7:39                     ` Peter Hutterer
2017-05-27 19:23                       ` Rafael J. Wysocki
2017-05-11  0:09 ` [PATCH 0/2] acpi/button: revert v4.10 behavior Rafael J. Wysocki
2017-05-11  9:33   ` Benjamin Tissoires

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.